Training

Talend hands-on training by YDATAS

(SME and ex-Talend)

About the hands-on Training

Talend provides a development environment that lets you interact with many sources and targets without having to learn and write complicated code. The course covers both the on-premise and cloud offerings of the Talend product suite (as per client training/consulting needs).

Clients can handpick topics from the modules below, and the training is tailored accordingly.

M1:

This module covers the design, build, and deployment of data integration Jobs at beginner, intermediate, and advanced levels, using industry best practices. It also covers on-premise Enterprise Architecture, with setup recommendations and hand-holding.

M2:

Talend Cloud adoption, or migration from the Talend on-premise product (TAC) to Talend Cloud (PaaS/SaaS) offerings. Covers initial setup and hand-holding, deployments, CI/CD, and scheduling.

M3:

Data quality, cleansing, and profiling, and their integration with the core Talend data integration framework. Learn how to evaluate data quality according to a set of metrics and thresholds based on indicators, models, and rules for each data item to be analyzed or monitored.
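
As an illustration of the indicator-and-threshold idea, here is a minimal sketch in plain Python. It is not Talend's own API: the names `Indicator`, `null_percentage`, and `evaluate`, the sample rows, and the 10% threshold are all hypothetical and chosen only to show how one metric on one column can be measured and compared against a limit.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    """One measured data quality metric and its acceptance limit (hypothetical model)."""
    name: str
    value: float      # measured percentage, 0 to 100
    threshold: float  # maximum acceptable percentage

    @property
    def passed(self) -> bool:
        # The indicator passes when the measured value stays within the threshold.
        return self.value <= self.threshold


def null_percentage(rows, column):
    """Return the percentage of rows where `column` is missing, None, or empty."""
    if not rows:
        return 0.0
    nulls = sum(1 for row in rows if row.get(column) in (None, ""))
    return 100.0 * nulls / len(rows)


def evaluate(rows, column, max_null_pct):
    """Build a null-percentage indicator for one column."""
    return Indicator(
        name=f"null % of {column}",
        value=null_percentage(rows, column),
        threshold=max_null_pct,
    )


# Hypothetical sample data: two of the four email values are empty or missing.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": None},
    {"id": 4, "email": "d@example.com"},
]

result = evaluate(customers, "email", max_null_pct=10.0)
print(result.name, result.value, result.passed)
```

In the training itself, analyses of this kind are configured graphically in the Talend data quality tooling rather than hand-coded; the sketch only makes the underlying comparison explicit.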

M4:

Talend Data Catalog: business glossary, lineage, and how the product works with the Talend setup in your tech stack. It connects data from platforms, databases, and analytics tools to generate a holistic view of the information supply chain in a language that everyone can understand.

M5:

Talend ESB: API (REST/SOAP) calls, data extraction with a focus on XML/JSON structures, routes, scheduling, real-time data extraction, and configuration. Covers ESB Conductor, ESB Publisher, and ESB Runtime with routes and data services.

M6:

Data Governance: Data preparation for analysts and developers (how to use the Talend Data Preparation tool seamlessly) and the integration of its recipes back into the main integration Jobs (automation). Create datasets and preparations to deliver cleansed, structured, enriched data to business users.

Data stewardship:

Data cleansing activity (automation with the overall framework), user-based access control (gatekeeping), and creation of data models, campaigns, and tasks.

N.B: All the modules are independent and can be chosen based on need. Topics can be tailored as required.

Prerequisites:

Basic familiarity with Java or another programming language, SQL, and general database concepts. Prior exposure to an ETL tool is helpful but not required to learn Talend.

Detailed Course Contents

Day 1

Overview of Talend Open Studio and Enterprise Architecture (in depth if needed).

Overview of Talend and associated components.

Link Talend Studio to your Talend account, registering a new account if necessary

Start Talend Open Studio for Data Integration

Create a Talend project to contain tasks

Create a Talend Job to perform a specific task

Add and configure components to handle data input, data transformation, and data output

Run a Talend Job and examine the results

Build a visual model of a Talend Job or project

Source and target systems (read and write): Delimited, Positional, XML, Excel, Database (MySQL), JSON (on demand), Advanced XML (on demand), Unstructured (on demand)

Copy an existing Job as the basis for a new Job

Store configuration information centrally for use in multiple components

Extend data from one source with data extracted from a second source

Log data rows in the console rather than storing them

Day 2

Troubleshoot a join by examining failed lookups

Use components to filter data

Generate sample data rows

Execute Job sections conditionally

Duplicate output flows

Create a schema for use in multiple components

Create variables for component configuration parameters

Run a Job to access specific values for the variables

Employ mechanisms to kill a Job under specific circumstances

Include Job elements that change the behavior based on the success or failure of individual components or subjobs

Deep dive into subjobs and their relevance in different scenarios

Filter unique data rows

Perform aggregate calculations on rows

Use components to create an archive and delete files

Add comments to document a Job and its components

Generate HTML documentation for a Job

Export a Job

Run an exported Job independently of Talend Open Studio

Create a new version of an existing Job

Day 3

Context (parameterization): basic to advanced scenarios covering 7 levels of context.

Triggers: all 5 types of triggers with Job scenarios

How Jobs pass values between different subjobs

How components exchange information at runtime

How to configure runtime servers

Advanced error handling and debugging tips and tricks

Reusability at all levels: program/function, component, and Job

Day 4

Job design and build best practices.

Job performance tuning tips and techniques.

Error handling and auditing frameworks: overview and build approaches.

Error decoding and debugging techniques.

Root cause analysis techniques.

Day 5

Overview of Talend Administration Center (TAC) (subject to lab setup, as it is available only with enterprise licenses)

TAC basics: how to schedule a Job, create a task, create a trigger, and monitor a Job

Create servers and virtual Job servers.

Configuration management: how to clean logs from TAC and configure memory on TAC

Parameterizing a Talend Job from the command line and from TAC (may also be covered under levels of context in advanced data integration).

Execution plans vs. tasks.

Scheduling.

Monitoring.

Authorization and users.

Q&A session for pending questions (in-scope topics only)

Day 6

Talend Cloud:

Build Jobs in Studio and publish/promote them to TMC (Talend Management Console).

Version control and best practices.

TMC: Management

Tasks, environments, workspaces: in-depth concepts, hands-on.

CI/CD: deployment pipelines.

Cloud Engines vs. Remote Engines: when to use which, and why they are needed.

Job Server/Execution Server/Remote Engines: how to configure and set up in AWS, Azure, or GCP environments.

Virtual server setup.

Execution plans and scheduling: setup and hands-on.

Email alerting.

User access control and administration activities for the TMC admin.

How Talend Cloud licensing works.

Day 7

Cloud interfaces: Talend configuration and hands-on scenarios (case by case)

  1. Integration with AWS: S3 and Redshift
  2. Integration with Azure: Blob Storage, ADLS, SQL DB
  3. Integration with GCP: Cloud Storage and BigQuery

Integration with Snowflake:

Data loading and extraction using both Talend and Snowflake methods.

Snowflake bulk utility.

Best practices for managing data warehouses for optimal cost/credit management.

Managing Snowflake programmatically (graphically) from Talend Studio Job designs.

Integration with Salesforce:

REST API and Bulk API: hands-on.

Day 8

Data Cataloging:

  • Understand the role of data and metadata management
  • Explore the Talend Data Catalog graphical interface
  • Analyze data impact and trace data lineage
  • Create users and groups
  • Harvest metadata
  • Sample and profile data
  • Stitch metadata to build a configuration
  • Add a business glossary to a model
  • Import and apply semantic types

Day 9

Data Quality

Data quality analysis and matching

  • Structural analyses
  • Performing a basic column analysis
  • Adding regular expressions
  • Defining indicator thresholds
  • Using a column set analysis
  • Using a business rule analysis

Advanced matching

  • Using a matching integration Job
  • Deduplicating addresses

Cleansing & Privacy

  • Cleansing email addresses
  • Shuffling data for privacy
  • Masking data for privacy

N.B: An end-to-end business case project is included, covering most of the fundamentals and knowledge learned during the sessions.

N.B: The sessions are hands-on and cover industry use cases, with more practice and less theory.

Get in Touch

Let’s talk. We will get in touch within 1 business day. No obligation, and we don’t share your data with anyone.