[mod] drop fasttext-predict (#5795)
Removes the `fasttext-predict` dependency and the language detection code. If a user selects `auto` for the search language, the language now falls back directly to the `Accept-Language` header sent by the browser (which was already the fallback when fasttext returned no result).

- fasttext's [language detection is unreliable](https://github.com/searxng/searxng/issues/4195) for some languages, especially for short search queries, and in particular for queries containing proper names, which is a common case.
- `fasttext-predict` consumes [significant memory](https://github.com/searxng/searxng/pull/1969#issuecomment-1345366676) without offering users much real value.
- the upstream fasttext project was archived by Meta in 2024
- users already have two better alternatives: the `Accept-Language` header and the search-syntax language prefix (e.g. `:fr` or `:de`).

Related: https://github.com/searxng/searxng/issues/4195
Closes: https://github.com/searxng/searxng/issues/5790
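
The fallback described in this message can be illustrated with a minimal sketch. It is not the SearXNG implementation: the function names `parse_accept_language` and `resolve_search_language` and the `'all'` default are hypothetical, chosen only to show how an `auto` selection might resolve to the browser's preferred language.

```python
def parse_accept_language(header: str) -> list[str]:
    """Return the language tags of an Accept-Language header, best first.

    Hypothetical helper for illustration; not part of SearXNG.
    """
    weighted = []
    for position, item in enumerate(header.split(',')):
        parts = item.strip().split(';')
        tag = parts[0].strip()
        if not tag or tag == '*':
            continue  # skip empty entries and the wildcard
        quality = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.strip().partition('=')
            if name.strip() == 'q':
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            # negative quality sorts high weights first; position breaks ties
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def resolve_search_language(selected: str, accept_language: str, default: str = 'all') -> str:
    """Resolve the search language; `default` is an assumed placeholder value."""
    if selected != 'auto':
        return selected  # an explicit choice always wins
    tags = parse_accept_language(accept_language)
    return tags[0] if tags else default
```

With a header such as `de;q=0.5, en`, an `auto` selection would resolve to `en` in this sketch, while an explicit selection such as `fr` is returned unchanged.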
@@ -194,27 +194,3 @@ class TestXPathUtils(SearxTestCase): # pylint: disable=missing-class-docstring
         with self.assertRaises(SearxEngineXPathException) as context:
             utils.eval_xpath_getindex(doc, 'count(//i)', 1)
         self.assertEqual(context.exception.message, 'the result is not a list')
-
-    def test_detect_language(self):
-        # make sure new line are not an issue
-        # fasttext.predict('') does not accept new line.
-        l = utils.detect_language('The quick brown fox jumps over\nthe lazy dog')
-        self.assertEqual(l, 'en')
-
-        l = utils.detect_language(
-            'いろはにほへと ちりぬるを わかよたれそ つねならむ うゐのおくやま けふこえて あさきゆめみし ゑひもせす'
-        )
-        self.assertEqual(l, 'ja')
-
-        l = utils.detect_language('Pijamalı hasta yağız şoföre çabucak güvendi.')
-        self.assertEqual(l, 'tr')
-
-        l = utils.detect_language('')
-        self.assertIsNone(l)
-
-        # mix languages --> None
-        l = utils.detect_language('The いろはにほへと Pijamalı')
-        self.assertIsNone(l)
-
-        with self.assertRaises(ValueError):
-            utils.detect_language(None)  # type: ignore