Together AI

Software Development

San Francisco, California 25,013 followers

The future of AI is open-source. Let's build together.

About us

Together AI is a research-driven artificial intelligence company. We contribute leading open-source research, models, and datasets to advance the frontier of AI. Our decentralized cloud services empower developers and researchers at organizations of all sizes to train, fine-tune, and deploy generative AI models. We believe open and transparent AI systems will drive innovation and create the best outcomes for society.

Website
https://together.ai
Industry
Software Development
Company size
51-200 employees
Headquarters
San Francisco, California
Type
Privately Held
Founded
2022
Specialties
Artificial Intelligence, Cloud Computing, LLM, Open Source, and Decentralized Computing

Locations

  • Primary

    251 Rhode Island St

    Suite 205

    San Francisco, California 94103, US


Updates

  • Together AI

    Pika, a video generation company, leveraged our GPU Clusters and Inference solution to scale their AI video generation capabilities, enabling them to build a text-to-video model in just 6 months, save over $1 million in costs, and grow to millions of videos generated monthly.

    The Together AI Acceleration Cloud offers:
    • Frontier GPU clusters scaling to 10k+ GPUs: optimized for foundation model training, with multi-user environments and SSH access.
    • End-to-end platform: training, fine-tuning, and hosting custom models for inference.
    • Flexible commits: terms starting from one month, with scheduled build-up options.
    • Premium support: included with every cluster.
    • Cutting-edge architectures: support for state space models like Mamba and StripedHyena.
    • Industry-leading research: 9x faster training with FlashAttention-3 optimization.

    Ready to accelerate your AI development? Reserve a top-spec H100 cluster for training and fine-tuning today: https://lnkd.in/g6dAtUiq

  • Together AI reposted this

    Thomas Gburek

    Helping Startups Accelerate with NVIDIA

    Incredible news to share with the team at Together AI! If you want the best enterprise-grade infrastructure, managed and supported by NVIDIA, combined with the best available inference at scale from Together's Inference Engine, it's now possible! This groundbreaking partnership enables any company to build the best possible AI solutions using the best of both worlds. I'm happy to connect the appropriate parties on both sides if anyone is interested.

    Together AI

    We are thrilled to announce our collaboration with NVIDIA that brings the industry-leading Together Inference Engine to NVIDIA AI Foundry customers. This empowers enterprises and developers to leverage openly available models like Llama 3.1 running on the Together Inference Engine on NVIDIA DGX Cloud. Developers and enterprises can also fine-tune the models with their proprietary data to achieve higher accuracy and performance while maintaining ownership of their data and models.

    The Together Inference Engine is built on innovations including FlashAttention-3 kernels, custom-built speculators based on RedPajama, and the most accurate quantization techniques available on the market. These advancements make enterprise workloads highly optimized for NVIDIA Tensor Core GPUs, allowing customers to build and run generative AI applications on open-source models with unmatched performance, accuracy, and cost-efficiency at production scale.

    With this collaboration, businesses with sophisticated workloads on DGX Cloud can deploy open-source models into production faster on NVIDIA-optimized infrastructure, paired with the Together AI accelerated inference stack for unmatched performance, scalability, and security. https://lnkd.in/gt7xGzW2

  • Together AI

    We just built a full-stack AI tutor example app with the new Llama 3.1! It's called LlamaTutor, it's free, and it's fully open source: type in a topic and the education level you want to be taught at, and you'll get a personalized chatbot that uses up-to-date sources from the internet to teach you the material in an interactive way.

    Check out the app here → https://llamatutor.com/
    The code is also available → https://lnkd.in/e2c58hhC

    It's built with:
    ◆ Together AI's inference w/ Llama 3.1 (AI API)
    ◆ Serper for looking up sources (search API)
    ◆ Next.js app router, TypeScript, and Tailwind
    ◆ Helicone (YC W23) for AI observability
    ◆ Plausible Analytics for analytics

    Llama 3.1 just came out today, and we wanted to give developers an open-source app so they can start using it easily. You can call Llama 3.1 405B, 70B, or 8B through our platform with just a few lines of code.
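    The "few lines of code" claim can be sketched against an OpenAI-compatible chat-completions request. This is a minimal illustration, not official sample code: the endpoint URL and the exact model identifier below are assumptions, and an API key is required before the request would actually be sent.

    ```python
    import json
    import os

    # Request body for an OpenAI-compatible chat-completions endpoint.
    # The model identifier is an assumption; check Together's model list
    # for the exact Llama 3.1 names available on your account.
    payload = {
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        "messages": [
            {"role": "system", "content": "You are a patient tutor."},
            {"role": "user", "content": "Explain photosynthesis at a 5th-grade level."},
        ],
        "max_tokens": 256,
    }

    body = json.dumps(payload)

    # With TOGETHER_API_KEY set, this body would be POSTed with an
    # "Authorization: Bearer <key>" header to a chat-completions URL
    # (the URL below is an assumption based on common API conventions).
    if os.environ.get("TOGETHER_API_KEY"):
        import urllib.request

        req = urllib.request.Request(
            "https://api.together.xyz/v1/chat/completions",
            data=body.encode(),
            headers={
                "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
    ```

    Swapping the model string is all it takes to target 70B or 405B instead of 8B.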

  • Together AI reposted this

    Hassan El Mghari

    Building at Together.ai

    I built a free and open-source AI personal tutor with the new Llama 3.1! It's called LlamaTutor: type in a topic and the education level you want to be taught at, and you'll get a personalized chatbot that uses up-to-date sources from the internet to teach you the material in an interactive way.

    Check out the app here → https://llamatutor.com/
    The code is also available → https://lnkd.in/e2c58hhC

    It's built with:
    ◆ Together AI's inference w/ Llama 3.1 (AI API)
    ◆ Serper for looking up sources (search API)
    ◆ Next.js app router hosted on Vercel
    ◆ TypeScript for the language & Tailwind for CSS
    ◆ Helicone (YC W23) for AI observability
    ◆ Plausible Analytics for analytics

    #ai #opensource #artificialintelligence

  • View organization page for Together AI, graphic

    25,013 followers

    We’re thrilled to be a launch partner for the MongoDB AI Applications Program (MAAP). This program lets developers and enterprises build and run generative AI applications with the best performance, accuracy, and cost on the Together Platform, while keeping ownership of their models and their data secure.

    In partnership with @MongoDB, we offer an enterprise-grade, end-to-end solution for retrieval-augmented generation (RAG), semantic search, and conversational AI. Easily build and deploy RAG-based applications using MongoDB Atlas Vector Search together with Together AI’s Embedding & Inference endpoints, powering end-user use cases like support and semantic search.

    Leverage Together AI’s API to generate embeddings with top open-source models, seamlessly integrated with your Atlas Vector Search index, and enhance your applications with real-time inference for responsive user experiences. Together AI and MongoDB bring scalable, production-ready solutions to customer environments, with optimal performance and efficiency across workloads.

    Explore how our technology can elevate your applications: https://lnkd.in/gNiJANvi

    Together AI - Partner Ecosystem | MongoDB

    cloud.mongodb.com
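    The embed-then-index flow described above can be sketched as an embeddings request whose resulting vector is stored in a MongoDB document for Atlas Vector Search to index. This is an illustrative sketch only: the embedding model name, vector dimensionality, and document field names are assumptions, and the placeholder vector stands in for a real API response.

    ```python
    # Embeddings request for an OpenAI-compatible endpoint; the model name
    # below is an assumption — check Together's embedding model list.
    embed_request = {
        "model": "togethercomputer/m2-bert-80M-8k-retrieval",
        "input": "How do I reset my password?",
    }

    # The returned vector would be stored alongside the source text in a
    # MongoDB document, with Atlas Vector Search indexing the "embedding"
    # field (field name and 768-dim size are illustrative assumptions).
    doc = {
        "text": embed_request["input"],
        "embedding": [0.0] * 768,  # placeholder; real vector comes from the API
    }
    ```

    At query time the same embedding model encodes the user's question, and Atlas Vector Search retrieves the nearest documents to ground the chat model's answer.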

  • Together AI

    Today marks an inflection point for open-source AI with the launch of AI at Meta's Llama 3.1 405B, the largest openly available foundation model, which rivals the best closed-source models and rapidly accelerates the adoption of open-source AI among developers and enterprises. We are excited to partner with Meta to bring all the Llama 3.1 models (8B, 70B, 405B, and LlamaGuard) to Together Inference and Together Fine-tuning.

    Together Inference delivers horizontal scalability with industry-leading performance: up to 80 tokens per second for Llama 3.1 405B and up to 400 tokens per second for Llama 3.1 8B, 1.9x to 4.5x faster than vLLM while maintaining full accuracy against Meta's reference implementation across all models. Together Turbo endpoints are available at $0.18 for 8B and $0.88 for 70B, a 17x lower cost than GPT-4o. This empowers developers and enterprises to build generative AI applications at production scale in their chosen environment: Together Cloud (serverless or dedicated endpoints) or private clouds.

    As the launch partner for the Llama 3.1 models, we're thrilled for customers to get the best performance, accuracy, and cost for their generative AI workloads on the Together Platform while keeping ownership of their models and their data secure. Function calling is supported natively by each of the models, and JSON mode is available for the 8B and 70B models (coming soon for the 405B model).

    Together Turbo endpoints let businesses prioritize performance, quality, and price without compromise, providing the most accurate quantization available for Llama 3.1 models, closely matching full-precision FP16 models. These advancements make Together Inference the fastest engine for NVIDIA GPUs and the most cost-effective solution for building with Llama 3.1 at scale. https://lnkd.in/gFwBNQhJ
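    For the JSON mode mentioned above, a request typically differs from a plain chat completion only by a `response_format` field. A minimal sketch, assuming the OpenAI-compatible request schema the post implies (the model name and field support per model are assumptions to verify against the docs):

    ```python
    import json

    # Chat request asking the 8B model for a JSON-only reply via JSON mode.
    # Model identifier and the response_format field are assumptions based
    # on the OpenAI-compatible schema; verify against current API docs.
    payload = {
        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
        "messages": [
            {
                "role": "user",
                "content": "List the three Llama 3.1 model sizes as a JSON "
                           'object with a "sizes" array.',
            }
        ],
        "response_format": {"type": "json_object"},
    }

    # In JSON mode a well-formed reply decodes directly, e.g. a response
    # body like the hypothetical string below:
    example_reply = '{"sizes": ["8B", "70B", "405B"]}'
    parsed = json.loads(example_reply)
    ```

    Constraining output to JSON this way avoids brittle regex post-processing when the model's answer feeds directly into application code.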

  • Together AI

    Join our founder Vipul Ved Prakash on the Kleiner Perkins Grit Podcast with Joubin Mirzadegan and Bucky Moore as they dive deep into Together AI, the dynamic AI landscape, and running an AI business. 🎧

    🎉 Exciting milestone alert! 🎉 Tune in to the 2️⃣ 0️⃣ 0️⃣ th episode of Grit, featuring an insightful conversation with Vipul Ved Prakash, CEO and co-founder of Together AI, and Bucky Moore, partner at Kleiner Perkins. They dive deep into the future of AI, startup challenges, and the journey of Together AI. Don't miss this special episode! 🎧 https://lnkd.in/gtinZ3kS 📺 https://lnkd.in/gk2u6i9n #GritPodcast #AI #StartupJourney #TogetherAI #KleinerPerkins
