Data Scientists
15-2051.00
Develop and implement a set of techniques or analytics applications to transform raw data into meaningful information using data-oriented programming languages and visualization software. Apply data mining, data modeling, natural language processing, and machine learning to extract and analyze information from large structured and unstructured datasets. Visualize, interpret, and report data findings. May create dynamic data reports.
Sample of reported job titles: Analytics Consultant, Applied Scientist, Data Analyst, Data Analytic Scientist, Data Analytics Scientist, Data Analytics Specialist, Data Architect, Data Consultant, Data Economist, Data Engineer, Data Management Scientist, Data Mining Analyst, Data Modeler, Data Quality Analyst, Data Science Engineer, Data Scientist, Data Specialist, Data Visualization Developer, Machine Learning Data Scientist, Machine Learning Engineer, Machine Learning Scientist, Marketing Data Scientist, Psychometric Consultant, Quantitative Methodologist, Quantitative Researcher, Research Analyst, Research Scientist, Statistical Analyst, Statistical Consultant, Tableau Developer
Occupation-Specific Information
Tasks
- Analyze, manipulate, or process large sets of data using statistical software.
- Apply feature selection algorithms to models predicting outcomes of interest, such as sales, attrition, and healthcare use.
- Apply sampling techniques to determine groups to be surveyed or use complete enumeration methods.
- Clean and manipulate raw data using statistical software.
- Compare models using statistical performance metrics, such as loss functions or proportion of explained variance.
- Create graphs, charts, or other visualizations to convey the results of data analysis using specialized software.
- Deliver oral or written presentations of the results of mathematical modeling and data analysis to management or other end users.
- Design surveys, opinion polls, or other instruments to collect data.
- Identify business problems or management objectives that can be addressed through data analysis.
- Identify relationships and trends or any factors that could affect the results of research.
- Identify solutions to business problems, such as budgeting, staffing, and marketing decisions, using the results of data analysis.
- Propose solutions in engineering, the sciences, and other fields using mathematical theories and techniques.
- Read scientific articles, conference papers, or other sources of research to identify emerging analytic trends and technologies.
- Recommend data-driven solutions to key stakeholders.
- Test, validate, and reformulate models to ensure accurate prediction of outcomes of interest.
- Write new functions or applications in programming languages to conduct analyses.
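As an illustration of the task of comparing models using statistical performance metrics, the following is a minimal Python sketch using only the standard library. The observed values, the two sets of predictions, and the names model_a and model_b are hypothetical and chosen only for demonstration; the sketch uses mean squared error as the loss function and R-squared as the proportion of explained variance.

```python
from statistics import mean


def mean_squared_error(actual, predicted):
    """Average squared difference between observed and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Proportion of the variance in `actual` explained by `predicted`."""
    centre = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - centre) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical observed outcomes and predictions from two candidate models.
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
predictions = {
    "model_a": [10.5, 11.5, 14.5, 15.5, 18.5],
    "model_b": [12.0, 12.0, 14.0, 16.0, 16.0],
}

# Score every candidate on both metrics.
scores = {
    name: {
        "mse": mean_squared_error(observed, preds),
        "r2": r_squared(observed, preds),
    }
    for name, preds in predictions.items()
}

# Prefer the model with the lowest loss.
best = min(scores, key=lambda name: scores[name]["mse"])

for name, metrics in scores.items():
    print(f"{name}: MSE={metrics['mse']:.3f}, R^2={metrics['r2']:.3f}")
print("Lowest-loss model:", best)
```

In practice these metrics would usually be computed on held-out data rather than on the data used to fit the models, in line with the task of testing and validating models.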
Technology Skills
- Analytical or scientific software: Google Looker Analytics; IBM SPSS Statistics; Kubeflow; Mathematical software; Mlflow; SAS; StataCorp Stata; Statistical software; The MathWorks MATLAB
- Application server software: Docker; GitHub; Kubernetes
- Business intelligence and data analysis software: Alteryx software; Apache Spark; Business intelligence software; MapReduce big data software; Microsoft Power BI; Qlik Tech QlikView; Tableau
- Cloud-based management software: Amazon Web Services AWS SageMaker; Google Cloud software
- Content workflow software: Atlassian JIRA
- Data base management system software: Apache Cassandra; Apache Hadoop; Apache Hive; Apache Pig; Elasticsearch; MongoDB; NoSQL; Teradata Database
- Data base reporting software: Reporting software
- Data base user interface and query software: Amazon Elastic Compute Cloud EC2; Amazon Redshift; Amazon Web Services AWS software; BigQuery; Microsoft Access; Microsoft SQL Server; Neo4j; NumPy; pandas; PySpark; PyTorch; Structured query language SQL
- Development environment software: Apache Kafka; C; Flask; Go; Julia; Microsoft Azure software; OpenAI ChatGPT; Ruby; Scikit-learn; XGBoost
- Enterprise application integration software: Jenkins CI
- Enterprise resource planning ERP software: Management information systems MIS
- Enterprise system management software: Splunk Enterprise
- File versioning software: Git
- Geographic information system: Geographic information system GIS systems
- Industrial control software: Apache MXNet; TensorFlow
- Object or component oriented development software: C#; C++; Jupyter software; Oracle Java; Perl; Python; R; Scala; SciPy; Shiny; spaCy
- Object oriented data base management software: PostgreSQL
- Office suite software: Microsoft Office software
- Operating system software: Bash; Keras; Linux; Shell script; UNIX
- Presentation software: Microsoft PowerPoint
- Procedure management software: Apache Airflow
- Project management software: Atlassian Confluence
- Spreadsheet software: Microsoft Excel
- Storage networking software: Amazon Simple Storage Service S3
- Web platform development software: JavaScript; JavaScript Object Notation JSON; RESTful API
Occupational Requirements
Work Activities
Detailed Work Activities
- Analyze data to inform operational decisions or activities.
- Analyze business or financial data.
- Determine appropriate methods for data analysis.
- Prepare data for analysis.
- Prepare graphics or other visual representations of information.
- Prepare analytical reports.
- Present research results to others.
- Develop procedures to evaluate organizational activities.
- Select resources needed to accomplish tasks.
- Analyze data to identify trends or relationships among variables.
- Analyze data to identify or resolve operational problems.
- Apply mathematical principles or statistical approaches to solve problems in scientific or applied fields.
- Update technical knowledge.
- Advise others on analytical techniques.
- Develop scientific or mathematical models.
- Write computer programming code.
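The activity of analyzing data to identify relationships among variables can be sketched with a Pearson correlation coefficient. This is a minimal standard-library Python example, not a prescribed method; the variable names (hours_studied, test_score, errors_made) and their values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cross / (spread_x * spread_y)


# Hypothetical data: one positive and one perfectly negative relationship.
hours_studied = [1.0, 2.0, 3.0, 4.0, 5.0]
test_score = [2.0, 4.0, 5.0, 4.0, 5.0]
errors_made = [5.0, 4.0, 3.0, 2.0, 1.0]

print("hours vs. score: ", round(pearson_correlation(hours_studied, test_score), 3))
print("hours vs. errors:", round(pearson_correlation(hours_studied, errors_made), 3))
```

A coefficient near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear association; it does not by itself establish causation.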