Big Data: Architectures and Technologies - eLearning

"Big data" systems are significant long-term investments that must scale to handle ever-increasing data volumes. They are therefore high-risk applications in which the software and data architecture are fundamental to success. This online course is presented by the Software Engineering Institute's research scientists Ian Gorton and John Klein. Through video instruction, exercises, and knowledge checks, the course focuses on the relationships among big data application software, data models, and deployment architectures, and on how specific technology selections relate to all of these.

The course contains six hours of video instruction, self-assessment quizzes, and exercises. A copy of the course slides is available to download.

This course is also offered as instructor-led training.


  • Architects
  • Technical stakeholders involved in the development of big data applications
  • Product managers, development managers, and systems engineers


At the completion of the course, learners will understand:

  • What "big data" is, how and why it has evolved, and the technologies that have emerged to address its complexities in the realm of computer science and software engineering
  • The basics of distributed systems, including durability, transactional consistency, and replica consistency
  • The quality attributes important in distributed systems and how they are achieved in practice
  • Specific technologies, such as NoSQL and NewSQL databases
  • Data modeling and the common types of data that need modeling
  • Performance considerations in data modeling
  • Distributed data processing frameworks employed in big data systems, such as Hadoop and its associated HDFS file system, which support downstream activities
  • The newly emerging distributed data processing frameworks
    • Distributed computations with Spark
    • Stream processing with Storm
  • Architectural issues present when building big data systems
  • Big data system design tactics
  • Software engineering heuristics to achieve effective, reliable, and scalable software systems


  • The major elements of big data software architectures
  • The different types and major features of NoSQL databases
  • Patterns for designing data models that support high performance and scalability
  • Distributed data processing frameworks


This course is taught by SEI research scientists Ian Gorton and John Klein by means of recorded video lectures. Following each lecture is a self-assessment or exercise to assist your comprehension of the concepts presented. Learners will also be able to access additional resources related to the subject matter, including a downloadable copy of the course presentation slides.


This course has no prerequisites.

To access the SEI Learning Portal, your computer must have the following:

  • For optimum viewing, we recommend using the following browsers: Chrome, Mozilla Firefox, Internet Explorer 8 or above, Safari 4 or above
  • These browsers are supported on the following operating systems: MS Windows 7 and above, OS X, and most Linux distributions
  • Mobile Operating Systems: iOS default browsers versions 6 and 7; Android versions higher than 4.2

This is an eLearning course.

Course Fees [USD]

  • eLearning: $400.00


Learners will have 90 days to complete the course. Upon completing all course elements, the learner is awarded an electronic certificate of course completion.

If you wish to purchase this course for a group of learners, please contact us by email or by telephone at +1 412-268-7622 for group rate details.

Course Questions?

Phone: 412-268-7388
FAX: 412-268-7401

Training courses provided by the SEI are not academic courses for academic credit toward a degree. Any certificates provided are evidence of the completion of the courses and are not official academic credentials. For more information about SEI training courses, see Registration Terms and Conditions and Confidentiality of Course Records.