161

What is the difference between spacy.load('en_core_web_sm') and spacy.load('en')? This link explains the different model sizes, but I am still not clear on how spacy.load('en_core_web_sm') and spacy.load('en') differ.

spacy.load('en') runs fine for me, but spacy.load('en_core_web_sm') throws an error.

I installed spaCy as shown below. When I go to Jupyter Notebook and run nlp = spacy.load('en_core_web_sm'), I get the error below:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-4-b472bef03043> in <module>()
      1 # Import spaCy and load the language library
      2 import spacy
----> 3 nlp = spacy.load('en_core_web_sm')
      4 
      5 # Create a Doc object

C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\spacy\__init__.py in load(name, **overrides)
     13     if depr_path not in (True, False, None):
     14         deprecation_warning(Warnings.W001.format(path=depr_path))
---> 15     return util.load_model(name, **overrides)
     16 
     17 

C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\spacy\util.py in load_model(name, **overrides)
    117     elif hasattr(name, 'exists'):  # Path or Path-like to model data
    118         return load_model_from_path(name, **overrides)
--> 119     raise IOError(Errors.E050.format(name=name))
    120 
    121 

OSError: [E050] Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

How I installed spaCy:

(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>conda install -c conda-forge spacy
Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder:

The following NEW packages will be INSTALLED:

    blas:           1.0-mkl
    cymem:          1.31.2-py35h6538335_0    conda-forge
    dill:           0.2.8.2-py35_0           conda-forge
    msgpack-numpy:  0.4.4.2-py_0             conda-forge
    murmurhash:     0.28.0-py35h6538335_1000 conda-forge
    plac:           0.9.6-py_1               conda-forge
    preshed:        1.0.0-py35h6538335_0     conda-forge
    pyreadline:     2.1-py35_1000            conda-forge
    regex:          2017.11.09-py35_0        conda-forge
    spacy:          2.0.12-py35h830ac7b_0    conda-forge
    termcolor:      1.1.0-py_2               conda-forge
    thinc:          6.10.3-py35h830ac7b_2    conda-forge
    tqdm:           4.29.1-py_0              conda-forge
    ujson:          1.35-py35hfa6e2cd_1001   conda-forge

The following packages will be UPDATED:

    msgpack-python: 0.4.8-py35_0                         --> 0.5.6-py35he980bc4_3 conda-forge

The following packages will be DOWNGRADED:

    freetype:       2.7-vc14_2               conda-forge --> 2.5.5-vc14_2

Proceed ([y]/n)? y

blas-1.0-mkl.t 100% |###############################| Time: 0:00:00   0.00  B/s
cymem-1.31.2-p 100% |###############################| Time: 0:00:00   1.65 MB/s
msgpack-python 100% |###############################| Time: 0:00:00   5.37 MB/s
murmurhash-0.2 100% |###############################| Time: 0:00:00   1.49 MB/s
plac-0.9.6-py_ 100% |###############################| Time: 0:00:00   0.00  B/s
pyreadline-2.1 100% |###############################| Time: 0:00:00   4.62 MB/s
regex-2017.11. 100% |###############################| Time: 0:00:00   3.31 MB/s
termcolor-1.1. 100% |###############################| Time: 0:00:00 187.81 kB/s
tqdm-4.29.1-py 100% |###############################| Time: 0:00:00   2.51 MB/s
ujson-1.35-py3 100% |###############################| Time: 0:00:00   1.66 MB/s
dill-0.2.8.2-p 100% |###############################| Time: 0:00:00   4.34 MB/s
msgpack-numpy- 100% |###############################| Time: 0:00:00   0.00  B/s
preshed-1.0.0- 100% |###############################| Time: 0:00:00   0.00  B/s
thinc-6.10.3-p 100% |###############################| Time: 0:00:00   5.49 MB/s
spacy-2.0.12-p 100% |###############################| Time: 0:00:10   7.42 MB/s

(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>python -V
Python 3.5.3 :: Anaconda custom (64-bit)

(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>python -m spacy download en
Collecting en_core_web_sm==2.0.0 from https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm==2.0.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz (37.4MB)
    100% |################################| 37.4MB ...
Installing collected packages: en-core-web-sm
  Running setup.py install for en-core-web-sm ... done
Successfully installed en-core-web-sm-2.0.0

    Linking successful
    C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\en_core_web_sm
    -->
    C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder\lib\site-packages\spacy\data\en

    You can now load the model via spacy.load('en')


(C:\Users\nikhizzz\AppData\Local\conda\conda\envs\tensorflowspyder) C:\Users\nikhizzz>
3
  • 11
    I have a couple of possible ideas about where the issue is. First, try to re-download the model: python -m spacy download en_core_web_sm Commented Jan 28, 2019 at 14:37
  • 2
    By the way, 'en' defaults to 'en_core_web_sm', so they are actually identical. See this. Commented Jan 28, 2019 at 14:43
  • 3
    Just running the python -m spacy download en_core_web_sm command is enough Commented Aug 15, 2021 at 18:55

34 Answers

246

Initially I downloaded two English model packages using the following commands in Anaconda Prompt.

python -m spacy download en_core_web_lg
python -m spacy download en_core_web_sm

But I kept getting a linkage error. Finally, running the command below established the link and resolved the error.

python -m spacy download en

Also make sure to restart your runtime if you are working with Jupyter. PS: If you get a linkage error, try running with admin privileges.

3
  • 7
    This worked for me too. In terminal, on my MacBook, I ran both python -m spacy download en_core_web_lg and python -m spacy download en_core_web_sm but continued to get the OSError: [E050] Can't find model 'en'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory. error in Python when entering spacy.load('en') and spacy.load('en_core_web_lg'). Running python -m spacy download en in terminal and then spacy.load('en') in Python, I was able to load the model. Commented Oct 15, 2019 at 22:30
  • I was still unable to fix the issue until @talha-tayyab suggested a runtime restart. Kudos to all of you, guys! stackoverflow.com/a/68012945/537648 Commented Sep 18, 2021 at 18:50
  • When I try this, it gives me a syntax error for the word spacy
    – user16951074
    Commented Feb 11, 2022 at 20:47
92
+50

The answer to your misunderstanding lies in a Unix concept, softlinks, which are similar to shortcuts in Windows. Let me explain.

When you run spacy download en, spaCy tries to find the best small model that matches your spaCy distribution. That small model defaults to en_core_web_sm, which comes in different variations corresponding to the different spaCy versions (for example, spacy and spacy-nightly have en_core_web_sm packages of different sizes).

When spaCy finds the best model for you, it downloads it and then links the name en to the package it downloaded, e.g. en_core_web_sm. That basically means that whenever you refer to en you will be referring to en_core_web_sm. In other words, en after linking is not a "real" package; it is just a name for en_core_web_sm.

However, it doesn't work the other way. You can't refer directly to en_core_web_sm because your system doesn't know you have it installed. When you did spacy download en you basically did a pip install. So pip knows that you have a package named en installed for your python distribution, but knows nothing about the package en_core_web_sm. This package is just replacing package en when you import it, which means that package en is just a softlink to en_core_web_sm.

Of course, you can directly download en_core_web_sm, using the command: python -m spacy download en_core_web_sm, or you can even link the name en to other models as well. For example, you could do python -m spacy download en_core_web_lg and then python -m spacy link en_core_web_lg en. That would make en a name for en_core_web_lg, which is a large spaCy model for the English language.

Hope it is clear now :)
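
The three cases named in the E050 message (shortcut link, Python package, path to a data directory) can be sketched in plain Python. This is a simplified illustration, not spaCy's actual code: the order of the checks is an assumption based on the wording of the error message, and classify_model_name and shortcut_dir are hypothetical names, with shortcut_dir standing in for a folder like spacy\data from the install log in the question. Shortcut names such as en were removed in spaCy v3, so the first case only applies to v2.

```python
import importlib.util
from pathlib import Path


def classify_model_name(name, shortcut_dir):
    """Return 'shortcut', 'package', 'path', or None for a model name.

    Hypothetical helper: mimics the three cases named in the E050 message.
    """
    shortcut_dir = Path(shortcut_dir)

    # 1. Shortcut link: an entry called `name` inside the data directory.
    if shortcut_dir.is_dir() and name in {p.name for p in shortcut_dir.iterdir()}:
        return "shortcut"

    # 2. Python package: importable under that name by this interpreter.
    try:
        if importlib.util.find_spec(name) is not None:
            return "package"
    except (ImportError, ValueError):
        # Raised for strings that are not valid module names, such as some paths.
        pass

    # 3. Path: an existing location on disk.
    if Path(name).exists():
        return "path"

    # None of the three matched, which is when E050 would be reported.
    return None
```

Calling classify_model_name('en_core_web_sm', ...) from inside the notebook and getting None would suggest that the kernel's interpreter cannot see the package, regardless of what was installed elsewhere.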

1
  • 3
    Yes, it works! I'm using Spacy with a MacBook pro. What I needed was the module en_core_web_lg
    – AlketCecaj
    Commented Jul 3, 2019 at 16:46
42

The following worked for me:

import en_core_web_sm

nlp = en_core_web_sm.load()
2
  • In case anyone gets an error (No module named 'en_core_web_sm') after trying the above, also run the command from this answer Commented Aug 15, 2021 at 19:25
  • If this solution works for you then you can find out the path of the module using print(en_core_web_sm.__file__) and check that it is installed in the correct version of python that you want to be using
    – Zhu Weiji
    Commented Jun 7, 2022 at 15:22
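
Following the suggestion in the comment above, a small check like this can show which interpreter the notebook is running and whether a model package is visible to it, without importing the model. It uses only the standard library; locate_package is a hypothetical helper name, not part of spaCy.

```python
import importlib.util
import sys


def locate_package(name):
    """Return the file path of an importable top-level package, or None."""
    spec = importlib.util.find_spec(name)
    # For a regular package, spec.origin is the path to its __init__.py.
    return spec.origin if spec is not None else None


# The interpreter behind the current kernel or script.
print("Interpreter:", sys.executable)

# None means the model is not installed for *this* interpreter,
# even if it was installed successfully in another environment.
print("en_core_web_sm:", locate_package("en_core_web_sm"))
```

If the interpreter path points to a different environment than the one used at the command prompt, that mismatch is a likely cause of the error.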
22

Using the Spacy language model in Colab requires only the following two steps:

  1. Download the model (change the name according to the size of the model)
!python -m spacy download en_core_web_lg 
  2. Restart the Colab runtime. Shortcut key: Ctrl + M + .

Test

import spacy
nlp = spacy.load("en_core_web_lg")

successful!!!

19

For those who are still facing problems even after installing it as administrator from Anaconda prompt, here's a quick fix:

  1. Go to the path where it is downloaded, e.g.

    C:\Users\name\AppData\Local\Continuum\anaconda3\Lib\site-packages\en_core_web_sm\en_core_web_sm-2.2.0
    
  2. Copy the path.

  3. Paste it in:

    nlp = spacy.load(r'C:\Users\name\AppData\Local\Continuum\anaconda3\Lib\site-packages\en_core_web_sm\en_core_web_sm-2.2.0')
    
  4. Works like a charm :)

PS: Check for spacy version
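
The versioned folder from step 1 does not have to be typed by hand. The sketch below assumes the layout shown in the path above (a package-name-plus-version folder inside the installed package folder); find_model_data_dir is a hypothetical helper, not a spaCy function.

```python
from pathlib import Path


def find_model_data_dir(package_dir):
    """Return the '<package>-<version>' subfolder of an installed model package.

    Assumes the layout seen in this thread, e.g.
    .../site-packages/en_core_web_sm/en_core_web_sm-2.2.0
    """
    package_dir = Path(package_dir)
    candidates = sorted(
        p for p in package_dir.glob(package_dir.name + "-*") if p.is_dir()
    )
    # Normally there is exactly one match; take the last one if there are several.
    return candidates[-1] if candidates else None
```

The result can then be passed on, for example spacy.load(str(find_model_data_dir(r'C:\Users\name\AppData\Local\Continuum\anaconda3\Lib\site-packages\en_core_web_sm'))).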

2
  • I pasted in but got error OSError: [E049] Can't find spaCy data directory: 'None'. Check your installation and permissions, or use spacy.util.set_data_path to customise the location if necessary., do you know what's going on?
    – wawawa
    Commented Jul 8, 2021 at 19:47
  • This solution worked for Google Colab as well.
    – B-Abbasi
    Commented Oct 11, 2022 at 13:58
18

Don't run !python -m spacy download en_core_web_lg from inside jupyter. Do this instead:

import spacy.cli
spacy.cli.download("en_core_web_lg")

You may need to restart the kernel before running the above two commands for it to work.

1
  • This worked beautifully. I didn't even have to restart the kernel! Only "issue" is that it will redownload the model each time the cell is executed. Why do we have to do this instead of running !python -m spacy download en_core_web_lg?
    – jspinella
    Commented Apr 30 at 20:07
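
A possible explanation for the question in the comment above: inside a notebook, !python is looked up through the shell's PATH and may not be the interpreter that runs the kernel, so the model can end up in a different environment. The sketch below pins the command to the kernel's own interpreter with sys.executable and skips the download when the package is already importable, which also avoids re-downloading on every run. The helper names download_command and ensure_model are made up for illustration.

```python
import importlib.util
import subprocess
import sys


def download_command(model):
    """Build a 'spacy download' command that uses the current interpreter."""
    return [sys.executable, "-m", "spacy", "download", model]


def ensure_model(model):
    """Download `model` only if this interpreter cannot import it yet.

    Returns True if a download was started, False if it was already present.
    """
    if importlib.util.find_spec(model) is not None:
        return False
    # check=True raises CalledProcessError if the download fails.
    subprocess.run(download_command(model), check=True)
    return True
```

In a notebook cell, ensure_model("en_core_web_lg") would be called once before spacy.load("en_core_web_lg"); a kernel restart may still be needed if the load fails right after the download.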
17

Try this method, as it worked like a charm for me:

In your Anaconda Prompt, run the command:

python -m spacy download en

After running the above command, you should be able to execute the below in your jupyter notebook:

spacy.load('en_core_web_sm')
1
  • @ Pallavi Banerjee.. Your answer is not significantly different from Tarun Reddy's answer. Commented Aug 2, 2022 at 5:24
7

Steps to load models based on different versions of spaCy

Download the best-matching version of a specific model for your spaCy installation:

python -m spacy download en_core_web_sm

Or pip install a .tar.gz archive from a path or URL:
pip install /Users/you/en_core_web_sm-2.2.0.tar.gz

or

pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz

Add it to your requirements file or environment YAML file. Each spaCy version is compatible with a range of model versions; you can see more at https://github.com/explosion/spacy-models/releases

If you're not sure, running the code below

nlp = spacy.load('en_core_web_sm') 

will give a warning telling you which model version is compatible with your installed spaCy version.

environment.yml example

name: root
channels:
  - defaults
  - conda-forge
  - anaconda
dependencies:
  - python=3.8.3
  - pip
  - spacy=2.3.2
  - scikit-learn=0.23.2
  - pip:
    - https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz#egg=en_core_web_sm
5
import spacy

nlp = spacy.load('/opt/anaconda3/envs/NLPENV/lib/python3.7/site-packages/en_core_web_sm/en_core_web_sm-2.3.1')

Try giving the absolute path of the package, including the version folder, as shown above.

It works perfectly fine.

5

I am running Jupyter Notebook on Windows.

In the end, it was a version issue. You need to execute the commands below in the conda command prompt (opened as admin):

  • pip install spacy==2.3.5

  • python -m spacy download en_core_web_sm

  • python -m spacy download en

from chatterbot import ChatBot
import spacy
import en_core_web_sm
nlp = en_core_web_sm.load()
ChatBot("hello")


5

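A simple solution for this, which I saw on spacy.io. Note that it creates a blank English pipeline (tokenizer only), without the trained components that en_core_web_sm provides: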

from spacy.lang.en import English
nlp=English()

https://course.spacy.io/en/chapter1

2
  • Can anyone please tell if this is correct or not? because it worked and nothing else above. Commented May 6, 2021 at 14:01
  • Yes! This works in PyCharm. Not sure why they show load method and then this version, like why is there two load methods.
    – Rob
    Commented Jul 19, 2021 at 17:38
4

First of all, install spaCy for Jupyter Notebook using the following command: pip install -U spacy

Then write the following code:

import en_core_web_sm
nlp = en_core_web_sm.load()
4

As for Windows based Anaconda,

  1. Open Anaconda Prompt

  2. Activate your environment. Ex: activate myspacyenv

  3. pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz

  4. python -m spacy download en_core_web_sm

  5. Open Jupyter Notebook, e.g. run activate myspacyenv and then jupyter notebook in Anaconda Prompt

import spacy
spacy.load('en_core_web_sm')

and it will run peacefully!

1
  • fyi for step #2 its conda activate myspacyenv.
    – Yev Guyduy
    Commented Jan 21, 2021 at 20:16
4

Best is to follow the official spacy docs for installation (https://spacy.io/usage):

First uninstall your current spacy version

pip uninstall spacy

Then install spaCy correctly

pip install -U pip setuptools wheel
pip install -U spacy
python -m spacy download en_core_web_sm
3

Open Anaconda Navigator. Click on any IDE. Run the code:

!pip install -U spacy
!python -m spacy download en_core_web_sm

It will work. If you opened the IDE directly, close it and follow this procedure once.

3

This will work-

try:
    nlp = spacy.load("en_core_web_trf")
except OSError:  # spacy.load raises OSError (E050) when the model is missing
    print("Downloading spaCy NLP model...")
    print("This may take a few minutes and it's one time process...")
    os.system(
        "pip install https://huggingface.co/spacy/en_core_web_trf/resolve/main/en_core_web_trf-any-py3-none-any.whl")
    nlp = spacy.load("en_core_web_trf")

Example / How to use-

import spacy
import os

try:
    nlp = spacy.load("en_core_web_trf")
except OSError:  # spacy.load raises OSError (E050) when the model is missing
    print("Downloading spaCy NLP model...")
    print("This may take a few minutes and it's one time process...")
    os.system("pip install https://huggingface.co/spacy/en_core_web_trf/resolve/main/en_core_web_trf-any-py3-none-any.whl")
    nlp = spacy.load("en_core_web_trf")


def perform_ner(*args, **kwargs):
    query = kwargs['query']
    # Process the input text with spaCy NLP model
    doc = nlp(query)

    # Extract named entities and categorize them
    entities = [(entity.text, entity.label_) for entity in doc.ents]

    return entities


if __name__ == "__main__":
    # Example input text
    input_text = "I want to buy a new iPhone 12 Pro Max from Apple."

    # Perform NER on input text
    entities = perform_ner(query=input_text)

    # Print the extracted entities
    print(entities)
2

Loading the model using a different syntax worked for me.

import en_core_web_sm
nlp = en_core_web_sm.load()
2

Anaconda Users

  1. If you're using a conda virtual environment, be sure that it has the same version of Python as your base environment. To verify this, run python --version in each environment. If they are not the same, create a new virtual environment with that version of Python (e.g. conda create --name myenv python=x.x.x).

  2. Activate the virtual environment (conda activate myenv)

  3. conda install -c conda-forge spacy
  4. python -m spacy download en_core_web_sm

I just ran into this issue, and the above worked for me. This addresses the issue of the download occurring in an area that is not accessible to your current virtual environment.

You should then be able to run the following:

import spacy
nlp = spacy.load("en_core_web_sm")
2

Open command prompt or terminal and execute the below code:

pip3 install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz

Execute the chunk below in your Jupyter notebook.

import spacy

nlp = spacy.load('en_core_web_sm')

Hope the above code works for all:)

2

I also had the same issue: I couldn't load the model using spacy.load(). You can follow the steps below to solve this on Windows:

  1. Download the model using !python -m spacy download en_core_web_sm
  2. Import the model package with import en_core_web_sm
  3. Load it into a variable using en_core_web_sm.load()

Complete code will be:

python -m spacy download en_core_web_sm

import en_core_web_sm

nlp = en_core_web_sm.load()
2

Instead of any of the above, this solved my error.

conda install -c conda-forge spacy-model-en_core_web_sm

If you are an anaconda user, this is the solution.

1

This works with colab:

!python -m spacy download en
import en_core_web_sm
nlp = en_core_web_sm.load()

Or for the medium:

import en_core_web_md
nlp = en_core_web_md.load()
0
1

I'm running PyCharm on macOS, and while none of the above answers completely worked for me, they did provide enough clues that I was finally able to get everything working. I am connecting to an ec2 instance and have configured PyCharm such that I can edit on my Mac and it automatically updates the files on my ec2 instance. Thus, the problem was on the ec2 side, where it was not finding spaCy even though I had installed it several different times and ways. If I ran my Python script from the command line, everything worked fine. However, from within PyCharm, it was initially not finding spaCy and the models. I eventually fixed the "finding spaCy" issue using the above recommendation of adding a "requirements.txt" file. But the models were still not recognized.

My solution: download the models manually and place them in the file system on the ec2 instance and explicitly point to them when loaded. I downloaded the files from here:

https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz

https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.0.0/en_core_web_lg-3.0.0.tar.gz

After downloading, I moved them to my ec2 instance, then decompressed and untarred them in my filesystem, e.g. /path_to_models/en_core_web_lg-3.0.0/

I then load a model using the explicit path and it worked from within PyCharm (note the path used goes all the way to en_core_web_lg-3.0.0; you will get an error if you do not use the folder with the config.cfg file):

nlpObject = spacy.load('/path_to_models/en_core_web_lg-3.0.0/en_core_web_lg/en_core_web_lg-3.0.0')
1
  • awesome, this was what I was looking for. Have you found a way to cURL the tar files down? I think github does some kind of redirect to prevent people using it as a file storage.
    – dcsan
    Commented May 31, 2021 at 2:11
1

Check the installed version of spaCy with pip show spacy. You will get something like this:

Name: spacy
Version: 3.1.3
Summary: Industrial-strength Natural Language Processing (NLP) in Python

Install the relevant version of the model using: !pip install -U https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz
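
As a rule of thumb, spaCy model packages share their first two version numbers with the spaCy release they were built for; the spacy-models releases page remains the authoritative compatibility list. Under that assumption, a quick check could look like the sketch below (same_major_minor is a hypothetical helper, not a spaCy function). By this rule, a 3.1.x installation would normally pair with a 3.1.x model rather than 3.0.0.

```python
def same_major_minor(spacy_version, model_version):
    """True if both version strings share their major and minor components."""
    return spacy_version.split(".")[:2] == model_version.split(".")[:2]


# Pairings taken from version numbers that appear in this thread.
print(same_major_minor("2.3.2", "2.3.1"))  # True
print(same_major_minor("3.1.3", "3.0.0"))  # False
```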

0

I tried all the above answers but could not succeed. Below worked for me :

(Specific to WINDOWS os)

  1. Run Anaconda Command Prompt with admin privileges (important).
  2. Then run the commands below:
  pip install -U --user spacy    
  python -m spacy download en
  3. Try the commands below for verification:
import spacy
spacy.load('en')
  4. It might work for other versions as well.
0

If you have already downloaded spacy and the language model (E.g., en_core_web_sm or en_core_web_md), then you can follow these steps:

  1. Open Anaconda prompt as admin

  2. Then type: python -m spacy link [package name or path] [shortcut]

    For example, python -m spacy link /Users/you/model en

This will create a symlink to your language model. Now you can load the model using spacy.load("en") in your notebooks or scripts.
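
What the link command does can be pictured as creating a symbolic link, named after the shortcut, inside spaCy's data directory, as in the "Linking successful" output in the question. The sketch below imitates that with temporary folders; it is not spaCy's implementation, and make_shortcut is a made-up name. On Windows, creating symlinks typically needs elevated privileges, which may be why running the prompt as admin helps. The link command and shortcut names were removed in spaCy v3, so this applies to v2 only.

```python
import tempfile
from pathlib import Path


def make_shortcut(package_dir, data_dir, shortcut):
    """Create data_dir/shortcut as a symlink pointing at package_dir."""
    link_path = Path(data_dir) / shortcut
    if link_path.is_symlink():
        link_path.unlink()  # replace an existing shortcut of the same name
    link_path.symlink_to(Path(package_dir), target_is_directory=True)
    return link_path


with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "en_core_web_sm"  # stands in for the installed model
    data = Path(tmp) / "data"               # stands in for spacy/data
    package.mkdir()
    data.mkdir()
    link = make_shortcut(package, data, "en")
    # Show the shortcut name and the folder it resolves to.
    print(link.name, "->", link.resolve().name)
```

Pointing the same shortcut at a different package, for example en_core_web_lg, only swaps the link target; the model packages themselves are untouched.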

0

This is what I did:

  1. Went to the virtual environment where I was working on Anaconda Prompt / Command Line

  2. Ran this: python -m spacy download en_core_web_sm

And was done

0

Try this:
!python -m spacy download en_core_web_md

0

I faced a similar issue too. Here is how I resolved it:

  1. Started Anaconda Prompt in admin mode.
  2. Ran both python -m spacy download en and python -m spacy download en_core_web_sm.
  3. Only after the above steps, started the Jupyter notebook where I access this package.

Now both nlp = spacy.load('en_core_web_sm') and nlp = spacy.load('en') work for me after import spacy.
0

I faced a similar issue. I installed spaCy and en_core_web_sm from a specific conda environment. However, I got two different errors:

[Errno 2] No such file or directory: '....\en_core_web_sm\en_core_web_sm-2.3.1\vocab\lexemes.bin' or OSError: [E050] Can't find model 'en_core_web_sm'.... It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory.

I did the following:

  1. Open Command Prompt as Administrator
  2. Go to c:\>
  3. Activate my conda environment (if you work in a specific conda environment):
c:\>activate <conda environment name>
  4. (conda environment name)c:\>python -m spacy download en
  5. Return to Jupyter Notebook and you can load the language library:
nlp = en_core_web_sm.load()

For me, it works :)
