spaCy - v3.5.1



✨ New features and improvements

  • NEW: spancat_singlelabel pipeline component for multi-class and non-overlapping span classification. The spancat_singlelabel component predicts at most one label for each suggested span and adds a new setting allow_overlap to restrict the output to non-overlapping spans (#11365).
  • Extend mypy support to v1.0 (#12245).
  • Use transformer + CNN for efficient GPU textcat with spacy init config (#11900).
  • Support trainable lemmatizer in spacy debug data (#11419).
  • Add new operators to dependency matcher for left/right immediate child/parent nodes (>+, >-, <+, <-) (#12334).
  • Add spacy.PlainTextCorpusReader.v1 for plain text input (#12122).
  • Add alignment_mode and span_id to Span.char_span() (#12145, #12196).
  • Use string formatting types in logging calls (#12215).
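
The four new dependency matcher operators combine a direct parent/child relation with adjacency in the token sequence. The sketch below is not spaCy code: it is a minimal pure-Python illustration, assuming the operator semantics given in spaCy's DependencyMatcher documentation (for example, `A >+ B` meaning B is a child of A and sits immediately to its right). The names `is_child`, `OPERATORS`, and `related_pairs`, as well as the toy parse, are hypothetical and exist only for this example.

```python
from typing import Callable, Dict, List, Tuple

# Toy parse (an assumption for illustration): heads[i] is the index of
# token i's head, and the root token points to itself.
words = ["She", "quickly", "ran", "home"]
heads = [2, 2, 2, 2]


def is_child(a: int, b: int) -> bool:
    """Return True if token b is a direct child of token a."""
    return a != b and heads[b] == a


# Each operator relates an anchor token A (index a) to a token B (index b).
OPERATORS: Dict[str, Callable[[int, int], bool]] = {
    # B is a child of A, immediately to the right of A.
    ">+": lambda a, b: is_child(a, b) and a == b - 1,
    # B is a child of A, immediately to the left of A.
    ">-": lambda a, b: is_child(a, b) and a == b + 1,
    # B is the parent of A, immediately to the right of A.
    "<+": lambda a, b: is_child(b, a) and a == b - 1,
    # B is the parent of A, immediately to the left of A.
    "<-": lambda a, b: is_child(b, a) and a == b + 1,
}


def related_pairs(op: str) -> List[Tuple[str, str]]:
    """List every (A, B) word pair in the toy parse that satisfies `op`."""
    check = OPERATORS[op]
    n = len(words)
    return [(words[a], words[b]) for a in range(n) for b in range(n) if check(a, b)]


for op in OPERATORS:
    print(op, related_pairs(op))
```

In a real pattern, one of these strings would go in the `REL_OP` field of a DependencyMatcher pattern entry, where earlier releases already accepted operators such as `>` and `<`.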

🔴 Bug fixes

  • #12017: Improve speed for top_k>1 in trainable lemmatizer.
  • #12048: Make test_cli_find_threshold() test more robust.
  • #12227: Fix return type of registry.find().
  • #12272: Fix speed regression for Matcher patterns with extension attributes.
  • #12287: Add grc to languages with lexeme norms in spacy-lookups-data.
  • #12320: Make generation of empty KnowledgeBase instances configurable.
  • #12343: Fix error message for displacy auto_select_port.
  • #12347: Fix length check for knowledge base in entity linker, add InMemoryLookupKB.is_empty.
  • #12365: Fix types for Lexeme.orth and Lexeme.lower.
  • #12366: Raise error for non-default vectors with PretrainVectors.
  • #12368: Partially address pending deprecation of pkg_resources.
  • Various improvements and fixes for the test suite (#12148, #12157, #12210, #12303, #12372).

📖 Documentation and examples

👥 Contributors

@adrianeboyd, @andyjessen, @danieldk, @essenmitsosse, @honnibal, @ines, @itssimon, @kadarakos, @kwhumphreys, @ljvmiranda921, @pmbaumgartner, @polm, @richardpaulhudson, @rmitsch, @shadeMe, @svlandeg, @tanloong, @thomashacker, @victorialslocum


Details

date: March 10, 2023, 9:02 a.m.
name: v3.5.1: spancat for multi-class labeling, fixes for textcat+transformers and more
type: Patch