HPU | Master of Science in Data Science Online

The Master of Science in Data Science is a comprehensive program designed to immerse students in cutting-edge technology and methodology in advanced data science applications. Data science skills, such as data wrangling, machine learning, coding, and data visualization, are increasingly necessary across disciplines including technology, science, finance, marketing, healthcare, and the social sciences. This program ensures that graduates will be equipped with the skills demanded in the fast-evolving landscape of the industry.

Topics Include:  

  • Artificial Intelligence  
  • Big Data Analytics 
  • High-performance Computing 
  • Cloud Computing 

Why HPU?

The Master of Science in Data Science is a comprehensive program designed to immerse students in cutting-edge technologies and advanced data analytical methods. This transformative curriculum ensures that graduates will be equipped with the skills to tackle ongoing real-world challenges. 

HPU’s fully online program gives students the flexibility to complete the degree on their own time, from any location, making it accessible to diverse learners from across the globe. Whether students come from finance, healthcare, technology, or any other background, the skills acquired during the program open doors to future employment opportunities. 

Job Outlook

The U.S. Bureau of Labor Statistics projects 35% growth in data science employment from 2022 to 2032, making it one of the fastest-growing fields. Jobs in artificial intelligence and machine learning, which are closely related to data science, are estimated to grow 21% from 2021 to 2031. About 17,700 job openings for data scientists are projected each year, indicating high demand for skilled professionals who can interpret data and provide actionable recommendations for improving business outcomes. 

Program Resources

Prerequisites: College-level courses in statistics, linear algebra, and coding in any language (e.g., Python or R) are strongly recommended but not required for admission. 

  • Online application ($55 app fee) 
  • Official transcript(s) (bachelor’s degree or higher) 
  • Resume 
  • English proficiency score required for international applicants. More information can be found online at hpu.edu/gradintl 

Optional: 

  • GMAT/GRE Score 
  • Personal Statement/Essay 
  • Letters of Recommendation 

  1. Use mathematical theory to design statistical models and estimate coefficients and uncertainty 
  2. Perform the six steps of data wrangling: discovery, structuring, cleaning, enriching, validating, and publishing 
  3. Write code in a programming language prominent in the field of data science to clean, analyze, visualize, and create models from data 
  4. Distinguish learning problems, select machine learning and deep learning models, and implement a training algorithm 
  5. Create and present effective data visualizations 
  6. Apply a framework to evaluate ethical issues in artificial intelligence and data science

 

DSCI 6000 Applied Statistics and Data Science (3 credits)  

Prerequisite: Graduate standing. This course offers an overview of three distinct yet interconnected perspectives: classical statistics, Bayesian statistics, and data science/machine learning (DSML). Classical statistics emphasizes rigorous inference rooted in the frequentist school, whereas the Bayesian school offers a probabilistic framework for incorporating prior knowledge, updating beliefs, and modeling uncertainty. DSML aims to extract insights and patterns from data and to build predictive models. 

 

DSCI 6100 Programming for Data Scientists (Python) (3 credits) 

An introduction to programming in the popular Python programming language. Topics include data types, simple statements, control structures, strings, functions, recursion, the Python interpreter, system command lines and files, module imports, object types, dynamic typing, scope, classes, operator overloading, exceptions, testing, and debugging. The course will enable students to program fluently in Python and move on to advanced topics such as programming collective intelligence and natural language processing. 

DSCI 6200 Data Science and Machine Learning (3 credits) 

This course provides an overview of modern data science and machine learning techniques, contrasting them with traditional statistical approaches. Students will learn how analysts can transition from classical statistics to more advanced predictive modeling and algorithmic data analysis. The course will cover the theoretical and applied aspects of powerful DSML tools, such as neural networks, support vector machines, decision trees, random forests, gradient boosting, XGBoost, model selection, model averaging, cluster analysis, and text mining. Upon completing this course, students will understand how to leverage modern modeling techniques to extract insights, predict outcomes, and optimize decisions. 

  

DSCI 6300 Data Visualization (3 credits)  

This course covers principles and tools for effectively visualizing and communicating data-driven insights. The focus will be on extracting and communicating patterns from data through interactivity and synthesis of complex information. Aligned with the exploratory data analysis paradigm, emphasis will be placed on using visualizations to ask and answer "what-if" questions about data. Topics include, but are not limited to, univariate data visualization, high-dimensional data visualization, visualization for trend-based data, visualization for spatial data, and dashboarding. Through hands-on assignments, students will gain skills in creating insightful, impactful data graphics using leading dynamic visualization tools. 

 

DSCI 6400 Ethics in Data Science and Artificial Intelligence (3 credits)  

This course provides an overview of data-related ethical issues, particularly in artificial intelligence, machine learning, and big data. Students will gain an understanding of current debates, frameworks, and regulations regarding data ethics. Key topics include privacy and confidentiality, transparency and explainability, bias and fairness, copyright and intellectual property, and misuse prevention and safety. 

 

CYBS 6020 Cloud Computing Platforms, Applications, and Data Security (3 credits)  

This course provides an overview of vendor-independent cloud computing concepts and methods. Several cloud providers, along with their tools, will be referenced. Students will learn specifics about software as a service (SaaS), platform as a service (PaaS), infrastructure as a service (IaaS), server and desktop virtualization, and more. Specific topics include cloud-related security risks and threats, cloud architecture and design, and operations and support. 

 

DSCI 6600 Data Wrangling with SQL (3 credits)  

This hands-on course provides the skills to wrangle, clean, transform, and munge data using Structured Query Language (SQL). Students will learn SQL programming techniques for dealing with common data issues such as missing values, duplicate records, parsing errors, inconsistent formats, and integrating data from different sources. 

 

DSCI 6700 Text Mining and Unstructured Data (3 credits) 

This course introduces techniques for extracting insights from unstructured textual, visual, audio, and video data. Students learn text-mining tools to analyze patterns in textual corpora and acquire skills for organizing and making sense of other unstructured data types. Topics include, but are not limited to, text mining algorithms such as classification, clustering, and sentiment analysis; web scraping and collection of online text data; audio and video feature extraction techniques; and image classification and object recognition. Through hands-on assignments and projects, students will gain practical experience applying text mining, computer vision, and other unstructured data analysis techniques to real-world datasets. 

 

DSCI 6800 Artificial Intelligence and Machine Learning (3 credits)  

This course provides a broad overview of the fields of artificial intelligence and machine learning. Students will learn fundamental concepts and algorithms that enable computers to mimic human intelligence for tasks such as pattern recognition, prediction, optimization, and decision-making. Topics include, but are not limited to, supervised learning algorithms, unsupervised learning algorithms, reinforcement learning for sequential decision-making, deep learning using multiple hidden layers, natural language processing for text and speech, computer vision for image and video processing, generative AI (e.g., ChatGPT, Midjourney, and Stable Diffusion), and AI ethics, bias, and social impact. Students will gain hands-on experience applying AI techniques and machine learning algorithms to build intelligent systems. Programming will be done in languages such as Python. 

 

DSCI 7000 Data Science Capstone (3 credits) 

This capstone course provides the culminating experience for students in the Master's in Data Science program. Working individually or in a team, students will conceptualize, propose, and execute an end-to-end data science project using real-world big data. The project will integrate skills and concepts learned throughout the program, including statistical analysis, machine learning, and communication of results. Under the instructor’s guidance, students will identify a problem amenable to data science techniques, acquire appropriate datasets, perform exploratory data analysis, implement data cleaning and feature engineering pipelines, train machine learning models, and measure model performance.