> job detail
Member of Technical Staff, Microsoft Robotics (Robotics Data)
Microsoft · Redmond, WA, US
// classified as
Other (Adjacent or hard to classify.)
posted
2d ago
location
Redmond, WA, US
languages
python
tools
azure, databricks, grafana
> stack
python, azure, databricks, grafana, mlflow, spark, matplotlib, numpy, pandas, plotly, streamlit
> education
doctorate
> description
Define and implement data collection strategies for robot learning, including specifying demonstration coverage requirements, environmental diversity targets, task distribution plans, and quality acceptance criteria for teleoperation, egocentric, and autonomous data collection campaigns.
Build and maintain data curation pipelines that ingest, clean, validate, label, and version robotics datasets (manipulation demonstrations, navigation trajectories, sensor logs, simulation rollouts), ensuring data integrity and provenance tracking.
Develop data analysis frameworks that quantify dataset characteristics (coverage, diversity, balance, quality scores), identify data gaps and biases, and provide recommendations for targeted data collection to improve model performance.
Create interactive data visualization tools and dashboards (using tools such as Power BI, Plotly, or custom web applications) that enable researchers, engineers, and leadership to explore dataset properties, model training metrics, evaluation results, and fleet operational telemetry.
Collaborate with ML researchers and learning engineers to design and execute experiments that measure the impact of data quantity, quality, and diversity on model performance, producing statistical analyses that guide data investment decisions.
Formulate and maintain a roadmap of data science project activity that leads to measurable improvement in model performance metrics, data pipeline efficiency, and data quality over time.
Develop and apply statistical techniques (hypothesis testing, causal inference, regression analysis, clustering) to analyze robot performance data, identify failure modes, and uncover patterns that inform model architecture and training strategy decisions.
Write efficient, readable, extensible code in Python (including Pandas, NumPy, scikit-learn, matplotlib) for data processing, analysis, and visualization, building professional-grade documentation for knowledge transfer.
Adhere and contribute to ethics and privacy policies related to collecting and preparing robotics data, providing guidance on responsible data practices including bias detection, consent, and data governance.
Present results and findings to senior stakeholders, using compelling visualizations and storytelling to influence data investment priorities and model development strategy.

Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 5+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)
OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 3+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)
OR Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 1+ year(s) data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)

Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 7+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)
OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 5+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)
OR Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 3+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)
OR equivalent experience.

Experience with robotics data pipelines, including teleoperation demonstration data, sensor logs, simulation rollouts, or autonomous driving datasets.
Proficiency in data visualization tools and frameworks (Power BI, Plotly, D3.js, Streamlit, Grafana, or equivalent) for building interactive operational and analytical dashboards.
Experience with large-scale data processing frameworks (Apache Spark, Databricks, Azure Data Factory, or equivalent) for handling multimodal robotics datasets.
Familiarity with machine learning model evaluation methodologies, including benchmark design, ablation studies, and statistical significance testing for robotics applications.
Understanding of data governance, data versioning (DVC, MLflow, or equivalent), and data quality assurance practices for ML training datasets.