About: Bloom filters in bioinformatics

Property	Value
dbo:abstract	مرشحات بلوم هي هياكل بيانات احتمالية موفرة للمساحة تُستخدم لاختبار ما، إذا كان العنصر جزءًا من مجموعة. تتطلب فلاتر بلوم مساحة أقل بكثير من هياكل البيانات الأخرى؛ لتمثيل المجموعات، ولكن الجانب السلبي لفلاتر بلوم هو أن هناك معدل إيجابي كاذب عند الاستعلام عن بنية البيانات. نظرًا لأن العناصر المتعددة قد يكون لها نفس قيم التجزئة لعدد من وظائف التجزئة، فهناك احتمال أن يؤدي الاستعلام عن عنصر غير موجود إلى إرجاع عنصر إيجابي إذا تمت إضافة عنصر آخر بنفس قيم التجزئة إلى مرشح (Bloom). بافتراض أن دالة التجزئة لها احتمالية متساوية لاختيار أي فهرس لمرشح بلوم، فإن المعدل الإيجابي الكاذب للاستعلام عن مرشح بلوم هو دالة لعدد البتات وعدد وظائف التجزئة وعدد عناصر مرشح بلوم. يسمح هذا للمستخدم بإدارة مخاطر الحصول على نتيجة إيجابية خاطئة من خلال المساومة على مزايا المساحة لمرشح بلوم. تستخدم مرشحات بلوم في المقام الأول في المعلوماتية الحيوية لاختبار وجود k-mer في تسلسل أو مجموعة من التسلسلات. يتم فهرسة k-mers للتسلسل في مرشح (Bloom)، ويمكن الاستعلام عن أي k-mer من نفس الحجم مقابل مرشح Bloom. هذا هو البديل المفضل لتجزئة k-mers في تسلسل مع جدول تجزئة، خاصة عندما يكون التسلسل طويلًا جدًا، حيث يتطلب تخزين أعداد كبيرة من k-mers في الذاكرة. (ar) Bloom filters are space-efficient probabilistic data structures used to test whether an element is a part of a set. Bloom filters require much less space than other data structures for representing sets, however the downside of Bloom filters is that there is a false positive rate when querying the data structure. Since multiple elements may have the same hash values for a number of hash functions, then there is a probability that querying for a non-existent element may return a positive if another element with the same hash values has been added to the Bloom filter. Assuming that the hash function has equal probability of selecting any index of the Bloom filter, the false positive rate of querying a Bloom filter is a function of the number of bits, number of hash functions and number of elements of the Bloom filter. 
This allows the user to manage the risk of getting a false positive by compromising on the space benefits of the Bloom filter. Bloom filters are primarily used in bioinformatics to test for the presence of a k-mer in a sequence or set of sequences. The k-mers of the sequence are indexed in a Bloom filter, and any k-mer of the same size can be queried against the Bloom filter. This is a preferable alternative to storing the k-mers of a sequence in a hash table, particularly when the sequence is very long, since holding large numbers of k-mers in memory is very demanding. (en)
dbo:thumbnail	wiki-commons:Special:FilePath/Querying_bloom_filter.svg?width=300
dbo:wikiPageID	60104849 (xsd:integer)
dbo:wikiPageLength	13286 (xsd:nonNegativeInteger)
dbo:wikiPageRevisionID	978647736 (xsd:integer)
dbo:wikiPageWikiLink	dbr:BLAST dbr:Next-generation_sequencing dbr:Metagenomics dbr:Bloom_filter dbr:De_Bruijn_graph dbc:Bioinformatics dbr:Data_structure dbr:Document_retrieval dbr:Hash_function dbr:K-mer dbr:RNA-Seq dbr:DNA_sequencing dbr:False_positive_rate dbr:Hash_table dbr:Reference_genome dbr:Sequence_Read_Archive dbr:European_Nucleotide_Archive dbr:Sanger_sequencing dbr:Genome_assembly dbr:File:Comparing_de_brujin_graph_in_memory.svg dbr:File:Querying_bloom_filter.svg
dbp:wikiPageUsesTemplate	dbt:Reflist
dct:subject	dbc:Bioinformatics
rdfs:comment	مرشحات بلوم هي هياكل بيانات احتمالية موفرة للمساحة تُستخدم لاختبار ما، إذا كان العنصر جزءًا من مجموعة. تتطلب فلاتر بلوم مساحة أقل بكثير من هياكل البيانات الأخرى؛ لتمثيل المجموعات، ولكن الجانب السلبي لفلاتر بلوم هو أن هناك معدل إيجابي كاذب عند الاستعلام عن بنية البيانات. نظرًا لأن العناصر المتعددة قد يكون لها نفس قيم التجزئة لعدد من وظائف التجزئة، فهناك احتمال أن يؤدي الاستعلام عن عنصر غير موجود إلى إرجاع عنصر إيجابي إذا تمت إضافة عنصر آخر بنفس قيم التجزئة إلى مرشح (Bloom). بافتراض أن دالة التجزئة لها احتمالية متساوية لاختيار أي فهرس لمرشح بلوم، فإن المعدل الإيجابي الكاذب للاستعلام عن مرشح بلوم هو دالة لعدد البتات وعدد وظائف التجزئة وعدد عناصر مرشح بلوم. يسمح هذا للمستخدم بإدارة مخاطر الحصول على نتيجة إيجابية خاطئة من خلال المساومة على مزايا المساحة لمرشح بلوم. (ar) Bloom filters are space-efficient probabilistic data structures used to test whether an element is a part of a set. Bloom filters require much less space than other data structures for representing sets, however the downside of Bloom filters is that there is a false positive rate when querying the data structure. Since multiple elements may have the same hash values for a number of hash functions, then there is a probability that querying for a non-existent element may return a positive if another element with the same hash values has been added to the Bloom filter. Assuming that the hash function has equal probability of selecting any index of the Bloom filter, the false positive rate of querying a Bloom filter is a function of the number of bits, number of hash functions and number of el (en)
rdfs:label	مرشحات بلوم في المعلوماتية الحيوية (ar) Bloom filters in bioinformatics (en)
owl:sameAs	wikidata:Bloom filters in bioinformatics dbpedia-ar:Bloom filters in bioinformatics dbpedia-fa:Bloom filters in bioinformatics https://global.dbpedia.org/id/AMeAp
prov:wasDerivedFrom	wikipedia-en:Bloom_filters_in_bioinformatics?oldid=978647736&ns=0
foaf:depiction	wiki-commons:Special:FilePath/Comparing_de_brujin_graph_in_memory.svg wiki-commons:Special:FilePath/Querying_bloom_filter.svg
foaf:isPrimaryTopicOf	wikipedia-en:Bloom_filters_in_bioinformatics
is dbo:wikiPageRedirects of	dbr:Bloom_filters_in_Bioinformatics
is dbo:wikiPageWikiLink of	dbr:Bloom_filter dbr:Bloom_filters_in_Bioinformatics
is foaf:primaryTopic of	wikipedia-en:Bloom_filters_in_bioinformatics
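
Example (illustrative only, not part of the DBpedia record): the abstract describes indexing the k-mers of a sequence in a Bloom filter and a false positive rate that depends on the number of bits, hash functions and elements. The Python sketch below shows one way this could look. The class name KmerBloomFilter, the helper names, the use of SHA-256 with double hashing to derive several indexes, and the sample sequence are all assumptions made for illustration; the rate function uses the standard approximation (1 - e^(-kn/m))^k for m bits, k hash functions and n inserted elements, which assumes uniformly distributed hashes.

```python
import hashlib
import math


class KmerBloomFilter:
    """Minimal Bloom filter for k-mers (illustrative sketch, not a tuned implementation)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # bit array packed into bytes

    def _indexes(self, kmer):
        # Assumption: derive all indexes from one SHA-256 digest via double hashing,
        # index_i = (h1 + i * h2) mod num_bits. Other hashing schemes work as well.
        digest = hashlib.sha256(kmer.encode("ascii")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force h2 to be odd, hence non-zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, kmer):
        for idx in self._indexes(kmer):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, kmer):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[idx // 8] & (1 << (idx % 8)) for idx in self._indexes(kmer))


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


def expected_false_positive_rate(num_bits, num_hashes, num_elements):
    """Standard approximation (1 - e^(-k*n/m))^k, assuming uniform hashing."""
    return (1.0 - math.exp(-num_hashes * num_elements / num_bits)) ** num_hashes


if __name__ == "__main__":
    sequence = "ACGTACGTTAGC"  # hypothetical toy sequence
    bloom = KmerBloomFilter(num_bits=1024, num_hashes=3)
    for kmer in kmers(sequence, 4):
        bloom.add(kmer)
    print("ACGT" in bloom)  # an inserted k-mer is always reported as present
    print(expected_false_positive_rate(1024, 3, len(set(kmers(sequence, 4)))))
```

Increasing num_bits lowers the expected false positive rate at the cost of memory, which is the trade-off the abstract refers to. A query for a k-mer that was never added can still return True, so a positive answer should be read as "possibly present".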