Apache Cassandra

Learn via: Virtual Classroom / Online
Duration: 3 Days

Description

    Apache Cassandra is a free, open-source, distributed NoSQL database that is widely chosen for workloads requiring high availability and scalability, particularly with large volumes of data. Cassandra supports replication across multiple datacenters, and its tunable consistency lets each read and write trade consistency against latency and availability. This Apache Cassandra training course provides an overview of the fundamentals of Big Data and NoSQL databases; an understanding of Cassandra and its features, architecture and data model; its role in the Hadoop Big Data ecosystem; and shows you how to install, configure and monitor Cassandra.

    The large volume and variety of data that today’s businesses process require a highly available, low-latency database. Apache Cassandra meets this need by supporting high-speed reads and writes across a replicated, distributed system. This course provides data modeling experience so that you can take advantage of Cassandra’s linearly scalable, peer-to-peer design.

     

    Delegates will learn how to:

    • Architect Cassandra databases and implement commonly used design patterns
    • Model data in Cassandra based on query patterns
    • Access Cassandra databases using CQL and Java
    • Create a balance between read/write speed and data consistency
    • Integrate Cassandra with Hadoop, Pig, and Hive

     

    Audience

    Professionals aspiring to a career in NoSQL databases and Cassandra, including:

    • Analytics professionals
    • Research professionals
    • IT developers
    • Testers
    • Project managers

     

    Prerequisites

    • Knowledge of databases and SQL
    • Java programming


Outline

NoSQL Overview

  • Justifying non-relational data stores
  • Listing the categories of NoSQL Data Stores

Exploring Cassandra

  • Defining column-family data stores
  • Surveying Cassandra
  • Dissecting the basic Cassandra architecture

Querying Cassandra

  • Defining the Cassandra Query Language (CQL)
  • Enumerating CQL data types
  • Manipulating data from the cqlsh interface

Leveraging Cassandra structures and types

  • Drawing comparisons with the relational model
  • Organizing data with keyspaces, tables and columns
  • Creating collections and counters

Modeling data based on queries

  • Designing tables around access patterns
  • Clustering with compound primary keys
  • Improving data distribution with composite partition keys

Detailing tunable consistency

  • Identifying consistency levels
  • Selecting appropriate read and write consistency levels
  • Distinguishing consistency repair features

Balancing consistency and performance

  • Relating replication factor and consistency
  • Trading consistency for availability
  • Achieving linearizable consistency with Compare-And-Set

Working with Cassandra collection types

  • Grouping elements in sets
  • Ordering elements in lists
  • Expressing relationships with maps
  • Nesting collections

Storing data for easy retrieval

  • Mapping data to tuples and user-defined types
  • Investigating the frozen keyword
  • Applying the Valueless Columns Pattern
  • Implementing clustering columns strategically

Controlling data life span

  • Expiring temporal data with time-to-live
  • Reviewing how tombstones achieve distributed deletes
  • Executing DELETEs and UPDATEs in the future

Constructing materialized views and time series

  • Modeling time series data
  • Enhancing queries with materialized views
  • Maintaining materialized views in the application
  • Driving analytics from materialized views

Managing triggers

  • Creating triggers by implementing ITrigger
  • Attaching triggers to tables
  • Supporting materialized views with triggers

Querying Cassandra data with the DataStax Java Driver

  • Connecting to a Cassandra cluster
  • Running CQL through the Java Driver
  • Batching prepared statements
  • Paginating large queries

Persisting Java Objects with Kundera

  • Defining the Java Persistence API (JPA)
  • Configuring Kundera to work with Cassandra
  • Generating schemas automatically
  • Managing JPA transactions in Kundera

Leveraging built-in Cassandra connectors

  • Loading data into Hadoop MapReduce with the Cassandra InputFormat
  • Utilizing the Cassandra Loader to create Pig relations
  • Converting a Cassandra table to a Hive table with the Cassandra serializer/deserializer (SerDe)