Research Scientist - Data Job at Storm3, Alameda, CA

  • Storm3
  • Alameda, CA

Job Description

⚡ Research Scientist - Data focus

💊 Foundation Models, AI Research Institute

🌎 San Francisco Bay Area, USA

💸 $200,000 - $350,000 salary + bonus

Come join a revolutionary AI research lab in the SF Bay Area that is poised to develop and publish high-impact breakthroughs in GenAI across LLMs and multimodal AI.

As part of the team, you’ll work at the intersection of data, large-scale training, and foundation model innovation. You will collaborate with world-class researchers, data scientists, and engineers to solve critical challenges in creating robust, scalable, and reasoning-capable LLMs. Your research will shape the way data is curated, processed, and leveraged to train the next generation of intelligent systems.

Responsibilities:

  • Lead research on data-centric approaches for LLMs, including pretraining corpus design, data valuation, and speculative decoding strategies.
  • Develop pipelines to process challenging data sources into structured and reproducible training datasets.
  • Build and optimize agentic data pipelines, integrating retrieval, self-curation, and multi-agent feedback for high-quality training and evaluation data.
  • Collaborate with researchers on alignment and reasoning-focused training that leverages data-driven approaches for improving LLM capabilities.
  • Prototype and deploy evaluation frameworks to measure data quality, coverage, and downstream impact on LLM reasoning.
  • Publish findings at top-tier venues (e.g., NeurIPS, ICLR, ACL, EMNLP) and represent the institute at international conferences.
  • Contribute to open-source tools, datasets, and benchmarks that advance the global foundation model research community.

Requirements:

  • Master’s degree in Computer Science, Data Science, or a related technical field (PhD strongly preferred)
  • Experience collecting and curating high-quality text data, including multilingual data.
  • Hands-on experience with large-scale dataset curation and preprocessing for ML/LLM training.
  • Prior work synthesizing complex datasets, with priority given to code, math, and agentic data.
  • Experience with ML infrastructure for scalable training, evaluation, and debugging.
  • Experience at the intersection of data and post-training (RL/SFT)
  • Proven ability to independently drive research questions related to data quality, scaling, or reasoning.

Preferred Experience:

  • Experience with retrieval-augmented generation (RAG), agentic data pipelines, or reasoning benchmarks.
  • Contributions to speculative decoding, self-curation, or reinforcement learning from synthetic data.
  • Background in knowledge graphs, semantic search, or indexing systems.
  • Strong publication record in leading AI conferences.
  • Prior contributions to open-source ML data tools or benchmarks.
  • Prior work on speculative decoding or contributions to LLM serving engines
  • Prior work on training LLM-as-a-judge
  • Deep expertise with tokenization/training tokenizers

Why apply:

  • Opportunity to build out a new division at the forefront of AI innovation
  • FAANG-competitive salary and package
  • Work alongside superstars from FAANG labs & leading AI companies
  • Medical, Dental and Vision Insurance
  • Relocation package available

📧 Interested in applying? Please click on the ‘Easy Apply’ button or alternatively email me your resume at [email protected]
