Discover Career Success by Uncovering the Secrets within Data
The 36-credit-hour online Master of Science in Data Science program can be completed within two years. Students choose between two concentrations: Computational Biology and Bioinformatics* and Computer Science.
MATH-51000 Mathematics for Data Scientists (3 credits)
Differentiation and integration of functions; basic matrix operations; linearization; linear and nonlinear optimization techniques; clustering and similarity measures; introduction to probability and statistics; basic computational algorithms. Includes frequent illustration of concepts using mathematical computation tools.
- Solve practical discrete mathematics and calculus problems common in statistical learning theory.
- Express linear models and related concepts in matrix algebra.
- Explore functions that model non-linearities in the data.
- Understand Bayesian theory used in common statistical learning applications.
- Explore optimization methods and understand how common iterative algorithms work.
YOUR OPPORTUNITY: You'll be prepared to comprehend and develop tools and techniques for understanding an organization's data.
MATH-51100 Concepts of Statistics I (3 credits)
Distribution of random variables, conditional probability and independence, distributions of functions of random variables, limiting distributions.
- Distinguish among various probability distributions and identify which distributions most closely characterize certain natural phenomena.
- Compute conditional probabilities.
- Compute and plot distributions of random variables.
- Solve problems whose data are characterized by specific distribution functions.
- Estimate values and compute confidence intervals.
- Formulate and test statistical hypotheses.
YOUR OPPORTUNITY: The statistical properties of data sets help identify trends and clarify conclusions. You'll understand where these statistical measures come from.
CPSC-51000 Introduction to Data Mining and Analytics (3 credits)
Overview of the field of data mining and analytics; large-scale file systems and Map-Reduce; measures of similarity; link analysis; frequent item sets; clustering; e-advertising as an application; recommendation systems.
- Describe examples of the kinds of problems confronted by data scientists.
- Identify the technologies used by data scientists to manage and analyze large quantities of data.
- Explain how large-scale file systems are distributed across clusters of machines.
- Explain how Map-Reduce manages and queries data.
- Calculate measures of similarity, clusters, and frequency of item sets.
- Describe the components and function of recommendation systems.
YOUR OPPORTUNITY: You'll breathe life into the mathematics you've learned by pairing it with computing techniques that process and analyze large data sets.
CPSC-51100 Statistical Programming (3 credits)
Programming structures and algorithms for large-scale statistical data processing and visualization. Students will use commonly available data analysis software packages to apply concepts and skills to large data sets and will also develop their own code using an object-oriented programming language.
- Implement algorithms using Python as the programming language.
- Employ various data structures for storing and accessing datasets.
- Identify common problems in datasets and methods of preparing data for analysis.
- Visualize data using an application programming interface (API).
- Develop programs to perform statistical analysis of various kinds of data.
YOUR OPPORTUNITY: You'll strengthen your programming skills as you study and build applications that make sense of large collections of data.
CPSC-52500 Encryption and Authentication Systems (3 credits)
(Double-numbered with 68-525) This course presents key cryptologic terms, concepts, and principles. It covers traditional cryptographic and cryptanalytic techniques, including both single-key and double-key algorithms, along with perspective on successes and failures in cryptologic history. Issues in network communications, network security, and security throughout the different layers of the OSI model for data communications are discussed in depth, as is the use of cryptologic protocols to provide a variety of security services in a networked environment. Authentication, access control, non-repudiation, data integrity, and confidentiality issues are also covered, as are key generation, control, distribution, and certification issues.
- How and where encryption and authentication are used.
- How to encrypt data using classical techniques.
- What the law says about encryption, authentication, and digital signatures.
- How symmetric block ciphers like DES, 3-DES, and AES work.
- How symmetric stream ciphers like RC4 and WEP-based encryption work.
- An overview of finite field theory.
- Where to deploy encryption modules and methods.
- How to distribute keys.
- Some theory about prime numbers.
- How public-key encryption and authentication techniques like RSA work.
- How public-key key distribution takes place, including Diffie-Hellman key exchange.
- How to authenticate messages using hash functions and digital signatures.
- What Kerberos and X.509 are.
- How to encrypt and authenticate electronic mail.
- How IPSec works.
- How web-server-to-web-client communications are secured.
YOUR OPPORTUNITY: With big data comes big responsibility. Data must be kept secure and private. This course will teach you how data is made confidential.
CPSC-53000 Data Visualization (3 credits)
The theory and practice of visualizing large, complicated data sets to clarify areas of emphasis. Human factors best practices will be presented. Programming with advanced visualization frameworks and practices will be demonstrated and used in group programming projects.
- Explain how the human brain processes visual information and how to design visualizations accordingly.
- Select the best ways to present a given data set.
- Write software that presents data in a particular way and that enables the user to interact with it in ways that improve understanding.
- Present data to peers in intuitive, instructive ways.
YOUR OPPORTUNITY: Making sense of complicated data sets requires intuitive displays. You'll learn the theory and practice of data visualization in this course.
CPSC-54000 Large-Scale Data Storage Systems (3 credits)
The design and operation of large-scale, cloud-based systems for storing data. Topics include operating system virtualization, distributed network storage, distributed computing, cloud models (IaaS, PaaS, and SaaS), and techniques for securing cloud and virtual systems.
- Distinguish among various cloud models (IaaS, PaaS, and SaaS).
- Explain how clusters of machines share processing and storage responsibility for large problems.
- Design large-scale storage systems that meet problem requirements.
- Explain how virtualization works in terms of operating system concepts.
- Write software that leverages the services of a cloud-based infrastructure.
YOUR OPPORTUNITY: You'll learn how to store large amounts of data in a way that is easy and efficient to query and organize.
CPSC-55000 Machine Learning (3 credits)
Algorithms for enabling artificial systems to learn from experience; supervised and unsupervised learning; clustering; reinforcement learning control. Students will write programs that demonstrate machine-learning techniques.
- Explain various techniques by which machines can learn from data and experience.
- Employ machine learning techniques to solve problems involving large data sets.
- Present results of applying machine learning to big data problems in oral and written form.
YOUR OPPORTUNITY: Data analysis software must dig through data with little guidance, which requires techniques for detecting and learning from cues automatically. You will learn how to code such systems.
Concentration in Computational Biology and Bioinformatics* (12 hours)
BIOL-50900 Introduction to Computational Biology (3 credits)
BIOL-51000 Data Systems in the Life Sciences (3 credits)
BIOL-51200 Research in Biotechnology (3 credits)
BIOL-59000 Data Mining and Analytics Thesis for Life Scientists (3 credits)
Concentration in Computer Science (12 hours)
CPSC-59000 Data Mining and Analytics Project for Computer Scientists (3 credits)
and choose three (3) of the following courses:
MATH-51200 Concepts of Statistics II (3 credits)
CPSC-51700 Pervasive Application Development (3 credits)
CPSC-55200 Semantic Web (3 credits)
CPSC-55500 Distributed Computing Systems (3 credits)
Take the Next Step
Discover more about Lewis University's online Master of Science in Data Science. Call (866) 967-7046 to speak with a Graduate Admissions Counselor or to request more information.
*Formerly known as Concentration for Life Sciences.