Classification in Data Mining

Classification is a data mining technique that assigns categories to a collection of data to support more accurate predictions and analysis. Sometimes loosely called a Decision Tree (strictly, a decision tree is one kind of classifier), classification is one of several methods intended to make the analysis of very large datasets effective.

Why Classification?

Very large databases have become the norm in today's world of "big data." Imagine a database holding multiple terabytes of data; a terabyte is one trillion bytes.

Facebook alone processes an enormous volume of new data every single day (as of 2014, the last time it reported these figures). The primary challenge of big data is how to make sense of it.

And sheer volume is not the only problem: big data also tends to be diverse, unstructured, and fast-changing. Consider audio and video data, social media posts, 3D data, or geospatial data. This kind of data is not easily categorized or organized.

To meet this challenge, a range of automatic methods for extracting useful information has been developed, among them classification.

How Classification Works

At the risk of getting too technical, let's look at how classification works. The goal is to create a set of classification rules that will answer a question, make a decision, or predict behavior. To start, a set of training data is assembled that contains a certain set of attributes along with the known outcome for each record.

The job of the classification algorithm is to discover how that set of attributes leads to its outcome.

Scenario: Perhaps a credit card company is trying to determine which prospects should receive a credit card offer.

This might be its set of training data:

Training Data
Name      Age  Gender  Annual Income  Credit Card Offer
John Doe  25   M       $39,500        No
Jane Doe  56   F       $125,000       Yes

The "predictor" columns Age, Gender, and Annual Income determine the value of the "prediction attribute" Credit Card Offer. In a training set, the prediction attribute is known. The classification algorithm then tries to determine how the value of the prediction attribute was reached: what relationships exist between the predictors and the decision? It develops a set of prediction rules, usually expressed as IF/THEN statements, for example:

IF (Age > 18 AND Age < 75) AND Annual Income > 40,000 THEN Credit Card Offer = Yes

Obviously, this is a simple example, and the algorithm would need a far larger data sample than the two records shown here. Further, the prediction rules are likely to be far more complex, including sub-rules that capture finer attribute details.
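To make the idea of "discovering" a rule concrete, here is a minimal sketch of a one-attribute learner that searches for an income cut-off separating the Yes and No records. It is only an illustration, not the method any particular product uses: the function name learn_income_threshold and the dictionary field names are assumptions, and it looks at Annual Income alone, using just the two training records from the table.

```python
# Toy sketch: learn a single income threshold from labeled records.
# Only the two training records from the text are used; field names are assumed.
training = [
    {"name": "John Doe", "age": 25, "gender": "M", "income": 39_500, "offer": False},
    {"name": "Jane Doe", "age": 56, "gender": "F", "income": 125_000, "offer": True},
]


def learn_income_threshold(records):
    """Return (threshold, errors) for the cut-off that misclassifies the fewest records.

    Candidate cut-offs are the midpoints between consecutive distinct incomes.
    The learned rule has the form: offer = income > threshold.
    """
    incomes = sorted({r["income"] for r in records})
    candidates = [(lo + hi) / 2 for lo, hi in zip(incomes, incomes[1:])]
    best_threshold, best_errors = None, len(records) + 1
    for threshold in candidates:
        # Count records where the rule's answer disagrees with the known outcome.
        errors = sum((r["income"] > threshold) != r["offer"] for r in records)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


threshold, errors = learn_income_threshold(training)
print(f"IF Annual Income > {threshold:,.0f} THEN Credit Card Offer = Yes "
      f"({errors} training errors)")
```

With only two records, the sketch settles on the midpoint between the two incomes (82,250) rather than the 40,000 used in the example rule. Any cut-off between 39,500 and 125,000 fits this data equally well, which is exactly why a much larger sample is needed.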

Next, the algorithm is given a "prediction set" of data to analyze, but this set lacks the prediction attribute (or decision):

Prediction Data
Name         Age  Gender  Annual Income  Credit Card Offer
Jack Frost   42   M       $88,000
Mary Murray  16   F       $0
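As a rough sketch of this step, the example IF/THEN rule can be written as a small function and applied to the two prediction records. The function name credit_card_offer and the field names are assumptions made for illustration, and the age condition is read as a range (older than 18 and younger than 75). Gender appears in the data but is not used by this particular rule.

```python
def credit_card_offer(age, annual_income):
    """Hand-coded version of the example IF/THEN rule (age read as a range)."""
    return 18 < age < 75 and annual_income > 40_000


# The two prediction records from the table; the outcome column is unknown.
prediction_set = [
    {"name": "Jack Frost", "age": 42, "gender": "M", "income": 88_000},
    {"name": "Mary Murray", "age": 16, "gender": "F", "income": 0},
]

# Fill in the missing Credit Card Offer value for each record.
predictions = {
    p["name"]: "Yes" if credit_card_offer(p["age"], p["income"]) else "No"
    for p in prediction_set
}

for name, offer in predictions.items():
    print(f"{name}: {offer}")
```

Under this reading of the rule, Jack Frost would receive an offer and Mary Murray would not, since she fails both the age and the income condition.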

Running the rules against this prediction data helps gauge how well the prediction rules perform, and the rules are then tweaked until the developer considers the predictions effective and useful.
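The tweak-and-check loop can be sketched as follows. Measuring accuracy requires records whose outcome is already known, so this illustration scores candidate rules against the two training records from the earlier table. The helpers make_rule and accuracy are hypothetical names, and the 150,000 cut-off is an arbitrary alternative chosen only to show a worse-scoring rule.

```python
def make_rule(min_income):
    """Build a rule with an adjustable income cut-off (hypothetical helper)."""
    def rule(age, income):
        return 18 < age < 75 and income > min_income
    return rule


# (age, annual income, known outcome) for the two training records in the text.
labeled = [
    (25, 39_500, False),   # John Doe: no offer
    (56, 125_000, True),   # Jane Doe: offer
]


def accuracy(rule, records):
    """Fraction of records where the rule agrees with the known outcome."""
    correct = sum(rule(age, income) == expected for age, income, expected in records)
    return correct / len(records)


# Compare the example cut-off with an arbitrary, stricter one.
for cutoff in (40_000, 150_000):
    print(cutoff, accuracy(make_rule(cutoff), labeled))
```

Here the 40,000 cut-off agrees with both known outcomes, while the stricter 150,000 cut-off wrongly rejects Jane Doe. A developer would keep adjusting and re-scoring rules in this way, ideally on a much larger set of labeled records.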

Day-to-Day Examples of Classification

Classification, along with other data mining techniques, is behind much of our day-to-day experience as consumers.

Weather forecasts can use classification to report whether the day will be rainy, sunny, or cloudy. The medical profession might analyze health conditions to predict medical outcomes. One type of classification method, Naive Bayesian, uses conditional probability to categorize spam emails. From fraud detection to product offers, classification works behind the scenes every day, analyzing data and producing predictions.
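To give a feel for how a Naive Bayesian classifier uses conditional probability, here is a minimal sketch that scores a message as spam or not spam ("ham") from word counts. The four training messages are hypothetical, invented purely for illustration, and the function names train and classify are assumptions. Add-one smoothing is used so that a word unseen in one class does not zero out the whole score.

```python
import math
from collections import Counter

# Hypothetical toy messages, invented purely for illustration.
training = [
    ("win free prize now", "spam"),
    ("free offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]


def train(messages):
    """Count how often each label and each word-within-label occurs."""
    label_counts = Counter(label for _, label in messages)
    word_counts = {label: Counter() for label in label_counts}
    for text, label in messages:
        word_counts[label].update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return label_counts, word_counts, vocabulary


def classify(text, label_counts, word_counts, vocabulary):
    """Pick the label with the highest log P(label) + sum of log P(word | label)."""
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(training)
print(classify("free prize", *model))
print(classify("monday meeting", *model))
```

The "naive" part is the assumption that words occur independently of one another given the label, which is what lets the per-word probabilities simply be multiplied (added, in log form).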