Hugging Face

The AI community building the future.

About us

Website
https://huggingface.co
Industry
Software Development
Company size
51-200 employees
Type
Privately Held
Founded
2016
Specialties
machine learning, natural language processing, and deep learning

Updates

  • Hugging Face reposted this

    Aymeric Roucher

    Machine Learning Engineer @ Hugging Face 🤗 | Polytechnique - Cambridge

    𝗔𝗴𝗲𝗻𝘁𝗶𝗰 𝗗𝗮𝘁𝗮 𝗮𝗻𝗮𝗹𝘆𝘀𝘁: 𝗱𝗿𝗼𝗽 𝘆𝗼𝘂𝗿 𝗱𝗮𝘁𝗮 𝗳𝗶𝗹𝗲, 𝗹𝗲𝘁 𝘁𝗵𝗲 𝗟𝗟𝗠 𝗱𝗼 𝘁𝗵𝗲 𝗮𝗻𝗮𝗹𝘆𝘀𝗶𝘀 📊⚙️

    Need to do quick exploratory data analysis? ➡️ Get help from an agent.

    I was impressed by Llama-3.1's capacity to derive insights from data. Given a CSV file, it makes quick work of exploratory data analysis and can surface interesting insights. On the data from the Kaggle Titanic challenge, which records which passengers survived the wreck, it derived trends entirely on its own, like "passengers who paid higher fares were more likely to survive" and "the survival rate was much higher for women than for men". The cookbook even lets the agent build its own submission to the challenge, and it ranks in the top 3,000 of 17,000 submissions: 👏 not bad at all!

    Try it for yourself in this Space demo 👉 https://lnkd.in/gzaqQ3rT
    Read the cookbook to dive deeper 👉 https://lnkd.in/gXx3-AyH
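
    For a sense of what the agent writes for itself, this is roughly the pandas code behind those two insights (a minimal sketch; the file name is illustrative and the columns are the standard Kaggle Titanic ones):

        # Sketch of the kind of EDA code the agent generates for itself.
        # Assumes a local "titanic.csv" with the standard Kaggle columns.
        import pandas as pd

        df = pd.read_csv("titanic.csv")

        # Survival rate by gender: the "much higher for women" trend.
        print(df.groupby("Sex")["Survived"].mean())

        # Survival rate by fare quartile: the "higher fares, higher survival" trend.
        df["fare_band"] = pd.qcut(df["Fare"], 4, labels=["Q1", "Q2", "Q3", "Q4"])
        print(df.groupby("fare_band", observed=True)["Survived"].mean())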

  • Hugging Face reposted this

    Philipp Schmid

    Technical Lead & LLMs at Hugging Face 🤗 | AWS ML HERO 🦸🏻♂️

    Excited to share that Hugging Face Inference Endpoints now support shared VPC endpoints and secrets.

    > With Shared Private Services, you can connect multiple Inference Endpoints to the same AWS VPC endpoint. Your requests never touch the public internet, secured by VPC-to-VPC communication. Set it up once and use it for all your Inference Endpoints.
    > Secrets provide a secure way to manage credentials or other sensitive information for custom container deployments or custom handlers.

    Get started: https://lnkd.in/ejEJqyM5
    Documentation: https://lnkd.in/dWD5gCQB
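
    On the secrets side, here is a minimal sketch of a custom handler that reads a configured secret. It assumes the secret is exposed to the container as an environment variable; the variable name MY_API_TOKEN is illustrative:

        # handler.py - hedged sketch of a custom Inference Endpoints handler.
        import os
        from typing import Any, Dict


        class EndpointHandler:
            def __init__(self, path: str = ""):
                # The secret never appears in the repository or in request payloads.
                self.api_token = os.environ["MY_API_TOKEN"]

            def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
                inputs = data.get("inputs", "")
                # ... call a model or a downstream service using self.api_token ...
                return {"echo": inputs}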

  • Hugging Face reposted this

    Abhishek Thakur

    AutoTrain @ 🤗 | GitHub ⭐️ | 1st 4x Kaggle GrandMaster ✨ | 150k+ LinkedIn, 100k+ YouTube 🚀

    🚨 NEW TASK ALERT: VLM finetuning 🚨 AutoTrain just added VLM finetuning: captioning and VQA for PaliGemma. It's now super easy to finetune PaliGemma on your own custom dataset. Which models and tasks would you like to see next? Let us know in the GitHub repo! 🚀
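
    For a sense of what such a finetune reduces to under the hood, here is a hedged sketch written directly against transformers rather than AutoTrain itself (the model id is the public base checkpoint, which is gated; the image and caption are dummies):

        import torch
        from PIL import Image
        from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

        model_id = "google/paligemma-3b-pt-224"  # requires accepting the model license
        processor = AutoProcessor.from_pretrained(model_id)
        model = PaliGemmaForConditionalGeneration.from_pretrained(
            model_id, torch_dtype=torch.bfloat16
        )

        image = Image.new("RGB", (224, 224))  # stand-in for a dataset image
        inputs = processor(
            text="caption en",             # task prompt
            images=image,
            suffix="a plain black square", # target caption; becomes the labels
            return_tensors="pt",
        )
        loss = model(**inputs).loss  # backpropagate this in a training loop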

  • Hugging Face reposted this

    Vaibhav Srivastav

    GPU poor @ Hugging Face

    Meta Llama 3.1 405B, 70B & 8B are here: multilingual, with 128K context and tool use + agents! Competitive with, and often beating, GPT-4o & Claude 3.5 Sonnet: unequivocally the best open LLM out there! 🔥

    Bonus: it comes with a more permissive license, which allows you to train other LLMs on its high-quality outputs 🐐

    Some important facts:
    > Multilingual: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai
    > MMLU: 405B (85.2), 70B (79.3) & 8B (66.7)
    > Trained on 15 trillion tokens + 25M synthetically generated outputs
    > Pre-training cut-off date of December 2023
    > Same architecture as Llama 3, with GQA
    > Used a massive 39.3 million GPU hours (16K H100s for the 405B)
    > 128K context ⚡
    > Excels at code-output tasks, too!
    > Releases Prompt Guard, a BERT-based classifier to detect jailbreaks, malicious code, etc.
    > Llama Guard 8B w/ 128K context for securing prompts across a range of topics

    How much GPU VRAM do you need to run these?
    405B: 810 GB in fp/bf16, 405 GB in fp8/int8, 203 GB in int4
    70B: 140 GB in fp/bf16, 70 GB in fp8/int8, 35 GB in int4
    8B: 16 GB in fp/bf16, 8 GB in fp8/int8 & 4 GB in int4

    In addition, we provide a series of quants ready to deploy: AWQ, bitsandbytes, and GPTQ. These allow you to run the 405B on as little as 4x A100 (80 GB) through TGI or vLLM. 🔥 Wait, it gets better: we also provide HF Pro users unlimited access via our deployed Inference Endpoint!

    Want to learn more? We wrote a detailed blog post on it 🦙 Kudos to AI at Meta for believing in open source and science! It has been fun collaborating! 🤗
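
    Those weight-memory numbers are just parameter count times bytes per parameter; a quick sanity check:

        # Weight VRAM ≈ parameters × bytes per parameter (1e9 params × 1 byte = 1 GB).
        for params_b in (405, 70, 8):
            for precision, bytes_per_param in (("fp/bf16", 2), ("fp8/int8", 1), ("int4", 0.5)):
                print(f"{params_b}B @ {precision}: {params_b * bytes_per_param:g} GB")
        # 405B @ int4 prints 202.5, i.e. the ~203 GB quoted above.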

  • Hugging Face reposted this

    Gradio

    🚀 Meta unveils Llama 3.1: this release changes everything! 🤯

    A bullet-point summary 👇 of all the technical details of Meta's Llama 3.1 release that you need to know:
    • Llama 3.1 comes in three sizes: 8B, 70B, and 405B parameters
    • All models support a context length of 128K tokens
    • New licensing terms allow using model outputs to improve other LLMs
    • Models trained on over 15 trillion tokens
    • Instruct models trained on publicly available instruction datasets and over 25M synthetically generated examples
    • Models are multilingual, supporting 8 languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai
    • Six new open LLM models released:
      - Meta-Llama-3.1-8B (base)
      - Meta-Llama-3.1-8B-Instruct (fine-tuned)
      - Meta-Llama-3.1-70B (base)
      - Meta-Llama-3.1-70B-Instruct (fine-tuned)
      - Meta-Llama-3.1-405B (base)
      - Meta-Llama-3.1-405B-Instruct (fine-tuned)
    • Two additional models released:
      - Llama Guard 3: for classifying LLM inputs and responses
      - Prompt Guard: a 279M-parameter BERT-based classifier for detecting prompt injection and jailbreaking
    • Uses Grouped-Query Attention (GQA) for efficient representation
    • Instruct models are fine-tuned for tool calling with two built-in tools (search, mathematical reasoning with Wolfram Alpha)
    • Supports four conversation roles: system, user, assistant, and ipython (for tool-call outputs)
    • Custom tool calling supported via JSON function calling
    • Official FP8-quantized version of Llama 3.1 405B available
    • AWQ and GPTQ quantized variants in INT4 also available
    • Approximate memory requirements:
      - 8B model: 16 GB (FP16), 8 GB (FP8), 4 GB (INT4)
      - 70B model: 140 GB (FP16), 70 GB (FP8), 35 GB (INT4)
      - 405B model: 810 GB (FP16), 405 GB (FP8), 203 GB (INT4)
    • KV cache memory requirements (in FP16) for 128K tokens:
      - 8B model: 15.62 GB
      - 70B model: 39.06 GB
      - 405B model: 123.05 GB

    Read everything about the Meta Llama 3.1 release in the detailed report on the Hugging Face blog: https://lnkd.in/g9yTBFnv
    Meta Llama 3.1 model collection on Hugging Face: https://lnkd.in/g_bVRpmp
    A Gradio demo for Meta Llama 3.1 8B is hosted on Hugging Face Spaces: https://lnkd.in/gKD6BYiW
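
    Where the KV-cache figures come from: per token you cache one key and one value vector per layer, each of size KV heads × head dim. A sketch reproducing the table, assuming head dim 128, 128,000 tokens, FP16, and the layer / KV-head counts below as I read the released configs:

        def kv_cache_gib(layers, kv_heads, head_dim=128, tokens=128_000, bytes_per=2):
            # 2 = one key plus one value vector per layer per token.
            return 2 * layers * kv_heads * head_dim * tokens * bytes_per / 2**30

        print(kv_cache_gib(32, 8))    # 8B   -> 15.625
        print(kv_cache_gib(80, 8))    # 70B  -> 39.0625
        print(kv_cache_gib(126, 16))  # 405B -> ~123.05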

    Llama 3.1 - 405B, 70B & 8B with multilinguality and long context (huggingface.co)

  • Hugging Face reposted this

    Philipp Schmid

    Technical Lead & LLMs at Hugging Face 🤗 | AWS ML HERO 🦸🏻♂️

    Llama 405B is here, and it comes with more than expected! 🚨 Meta Llama 3.1 comes in 3 sizes, 8B, 70B, and 405B, and speaks 8 languages! 🌍 Llama 3.1 405B matches or beats OpenAI's GPT-4o across many text benchmarks.

    New and improved in 3.1 ✨:
    🧮 8B, 70B & 405B versions as Instruct and Base, with 128K context
    🌐 Multilingual: supports 8 languages, including English, German, French, and more
    🔠 Trained on >15T tokens & fine-tuned on 25M human and synthetic samples
    📃 Commercially friendly license that allows using model outputs to improve other LLMs
    ⚖️ Quantized versions in FP8, AWQ, and GPTQ for efficient inference
    🚀 Llama 3.1 405B matches and beats GPT-4o on many benchmarks
    🧑🏻💻 8B & 70B improved coding and instruction following, by up to 12%
    ⚒️ Supports tool use and function calling
    🤖 Llama 3.1 405B available on the Hugging Face Inference API and in HuggingChat
    🤗 Available on @huggingface
    🔜 1-click deployments on Hugging Face, Amazon SageMaker, and Google Cloud

    Blog: https://lnkd.in/eiRsPgDj
    Model collection: https://lnkd.in/ehpTfzMq

    Big kudos to Meta for releasing Llama 3.1, including the 405B. This will help everyone accelerate and adopt AI more easily. ❤️
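
    A minimal sketch of trying it through the serverless Inference API with huggingface_hub (assumes a valid HF token is configured in your environment):

        from huggingface_hub import InferenceClient

        client = InferenceClient("meta-llama/Meta-Llama-3.1-405B-Instruct")
        response = client.chat_completion(
            messages=[{"role": "user", "content": "Why does open-source AI matter?"}],
            max_tokens=256,
        )
        print(response.choices[0].message.content)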

  • Hugging Face reposted this

    Sayak Paul

    ML @ Hugging Face 🤗

    Establishing strong automated reporting mechanisms is important for sustaining a good open-source project. On the 🧨 Diffusers team, we do this by reporting:
    🟢 the status of our nightly test suite
    🟢 the status of the nightly Docker builds (if any)
    🟢 the mirroring status of the community pipelines
    🟢 bi-weekly benchmarks

    The nightly test suite helps us discover any nasty bug relatively quickly. The benchmarking suite helps us dive deep into any performance creep when we see a slowdown in the numbers. All of these are reported to specific Slack channels with specific members, to reduce noise.

    Check out the workflows for more details: https://lnkd.in/g69zkwK2
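
    This is not the actual Diffusers workflow, but the general shape of such a reporting job is roughly the following: run the suite in CI, then post a one-line status to a channel-specific Slack incoming webhook (the environment variable name is illustrative):

        import os
        import subprocess

        import requests

        # Run the nightly suite and reduce it to a single pass/fail status line.
        result = subprocess.run(["pytest", "tests/", "-q"], capture_output=True, text=True)
        status = "✅ nightly suite passed" if result.returncode == 0 else "🚨 nightly suite failed"

        # Each report type can target its own channel via a dedicated webhook.
        requests.post(os.environ["SLACK_WEBHOOK_URL"], json={"text": status})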

  • Hugging Face reposted this

    Ahsen Khaliq

    ML @ Hugging Face

    Apple presents LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference.

    Paper page: https://buff.ly/4d9En9A

    The inference of transformer-based large language models consists of two sequential stages: 1) a prefilling stage to compute the KV cache of prompts and generate the first token, and 2) a decoding stage to generate subsequent tokens. For long prompts, the KV cache must be computed for all tokens during the prefilling stage, which can significantly increase the time needed to generate the first token. Consequently, the prefilling stage may become a bottleneck in the generation process. An open question remains whether all prompt tokens are essential for generating the first token. To answer this, we introduce a novel method, LazyLLM, that selectively computes the KV for tokens important for the next-token prediction in both the prefilling and decoding stages. Contrary to static pruning approaches that prune the prompt all at once, LazyLLM allows language models to dynamically select different subsets of tokens from the context at different generation steps, even if they were pruned in previous steps. Extensive experiments on standard datasets across various tasks demonstrate that LazyLLM is a generic method that can be seamlessly integrated with existing language models to significantly accelerate generation without fine-tuning. For instance, in the multi-document question-answering task, LazyLLM accelerates the prefilling stage of the Llama 2 7B model by 2.34x while maintaining accuracy.
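
    An illustrative sketch of the core idea (not the paper's implementation): rank prompt tokens by an importance score at each step and compute KV entries only for the top fraction, letting previously pruned tokens re-enter later:

        import torch

        def select_tokens(importance: torch.Tensor, keep_ratio: float) -> torch.Tensor:
            """Indices of the most important prompt tokens for this step."""
            k = max(1, int(importance.numel() * keep_ratio))
            return importance.topk(k).indices.sort().values

        importance = torch.rand(4096)  # stand-in for attention-based scores
        step_tokens = select_tokens(importance, keep_ratio=0.5)
        # Compute KV only for step_tokens; re-score and re-select at the next
        # step, so a token pruned earlier can be selected again later.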

  • Hugging Face reposted this

    Merve Noyan

    open-sourceress at 🤗 | Google Developer Expert in Machine Learning, MSc Candidate in Data Science

    PSA 🗣️ We kept shipping in June; here are some (non-exhaustive) Hugging Face Hub updates! See the deck below for what they look like, and keep reading 🤗

    📑 Datasets:
    - We've added new filters for modality, size, and format
    - Easily check how to load dataset repositories into other formats (datasets, pandas, and Croissant)
    - You can now sort dataset repositories by number of elements, and preview the number of elements in a dataset

    🤝 Community:
    - You can now open discussions at any organization (for anything not related to the models or datasets they share)
    - If you already have more than one paper on Hugging Face Papers, you can now submit a paper

    📚 Tasks (a documentation project for everyone to start building with machine learning 📖, at /tasks):
    - We now have a task page for vision language models
    - We have completely renewed the Feature Extraction task page to cover retrieval, reranking, RAG & co.
    - We have updated a ton of task pages with new models, datasets, and more
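
    As an example of the "load into other formats" snippets: a dataset's Parquet files can be read straight into pandas over the hf:// filesystem (the repo and file path here are illustrative; requires huggingface_hub to be installed):

        import pandas as pd

        # pandas resolves hf:// paths through huggingface_hub's filesystem.
        df = pd.read_parquet("hf://datasets/username/my-dataset/data/train-00000-of-00001.parquet")
        print(df.head())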

Funding

Hugging Face: 7 total rounds

Last round: Series D

See more info on Crunchbase