Course Outline

Day 1: Data Processing and Python Essentials 

Session 1: Spark DataFrames and Basic Operations 

  • Working with Spark DataFrames: Implementing Basic Operations
  • Groupby and Aggregate Operations 
  • Handling Timestamps and Dates 
  • Hands-on Exercise: Data analysis using Spark DataFrames 

Session 2: Python Programming for Big Data 

  • Core Python for Data Handling: Using Variables, Lists, and Functions
  • Working with Classes and Files 
  • Integrating APIs and External Data 
  • Hands-on Exercise: Building a Python project that processes and analyzes data with PySpark 

Day 2: Advanced PySpark and Machine Learning 

Session 3: Machine Learning with PySpark 

  • Implementing Machine Learning with Spark MLlib: Linear and Logistic Regression
  • Random Forest Classification Models 
  • Hands-on Exercise: Building and evaluating machine learning models using PySpark 

Session 4: Clustering and Recommender Systems 

  • K-means Clustering: Theory and Practical Implementation
  • Hands-on Exercise: Building a K-means clustering model 
  • Recommender Systems: Building a recommendation engine with Spark MLlib
  • Hands-on Exercise: Recommender system project 

Session 5: Spark Streaming and NLP 

  • Real-Time Data Streaming with Spark: Implementing real-time data processing
  • Hands-on Exercise: Streaming data with Spark 
  • Natural Language Processing (NLP) with PySpark: Implementing basic NLP tasks
  • Hands-on Exercise: NLP pipeline using PySpark 

Requirements

Python is a high-level programming language known for its clear syntax and code readability. Spark is a data processing engine used for querying, analyzing, and transforming big data. PySpark allows users to interface Spark with Python.

Target Audience: Intermediate-level professionals in the banking industry familiar with Python and Spark, seeking to deepen their skills in big data processing and machine learning. 

Duration: 14 Hours
