In this post I will show how to write a word count program in Hive.
Assume the table contains the following data:
This is a Hadoop Post
and Hadoop is a big data technology
From this data we want to generate the following word count:
a 2
and 1
big 1
data 1
Hadoop 2
is 2
Post 1
technology 1
This 1
Now let's write the query step by step.
1. Convert sentences into words
The data is stored as sentences, so first we have to break each sentence into words, using a space as the delimiter. Hive's split function does this and returns an array of strings.
split(sentence, ' ')
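To see what split returns on its own, a quick check like the one below can help. It is only a sketch and assumes a table named texttable with a single STRING column named sentence, the same names used in the queries in this post.

```sql
-- Assumes a table texttable with one STRING column named sentence.
-- split() returns an ARRAY<STRING>; its second argument is a regular expression.
SELECT sentence,
       split(sentence, ' ') AS words
FROM texttable;
```

Because the delimiter is treated as a regular expression, a pattern such as '\\s+' could be used instead of ' ' if the sentences may contain several spaces or tabs in a row.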
2. Convert column into rows
Now each row holds an array of strings like this:
[This,is,a,Hadoop,Post]
but we need one word per row, like this:
This
is
a
Hadoop
Post
In other words, every line of data has to be turned into multiple rows. Hive provides the explode function for this; it belongs to the family of table-generating functions (UDTFs), which emit several output rows for a single input row.
SELECT explode(split(sentence, ' ')) AS word FROM texttable
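We then wrap this query in parentheses and give it an alias, so that its output can be used as an intermediate table:
(SELECT explode(split(sentence, ' ')) AS word FROM texttable) tempTable
After the second step you should get the output below (shown sorted here for readability; Hive does not guarantee any row order without ORDER BY):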
a
a
and
big
data
Hadoop
Hadoop
is
is
Post
technology
This
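When explode is used directly in the SELECT list, Hive does not allow other columns next to it. If you want to keep the original sentence beside each word, LATERAL VIEW is the usual alternative. The sketch below assumes the same texttable and sentence names; the aliases t and exploded are chosen here only for illustration.

```sql
-- LATERAL VIEW joins each input row with the rows produced by explode(),
-- so ordinary columns can be selected alongside the generated word column.
SELECT t.sentence,
       exploded.word
FROM texttable t
LATERAL VIEW explode(split(t.sentence, ' ')) exploded AS word;
```

For the word count itself the simpler subquery form is enough, since only the word column is needed.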
3. Apply group by
After the second step the rest is straightforward: we apply GROUP BY on the word column to count the occurrences of each word.
SELECT word, count(1) AS count
FROM (SELECT explode(split(sentence, ' ')) AS word FROM texttable) tempTable
GROUP BY word
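To try the three steps end to end, a script along these lines should work. It is a sketch under a few assumptions: the table definition is not given in this post, so a single STRING column is assumed; INSERT ... VALUES requires Hive 0.14 or later; and the alias cnt and the ORDER BY clause are additions made here to get a readable, sorted result.

```sql
-- Assumed table layout: one STRING column holding a sentence per row.
CREATE TABLE IF NOT EXISTS texttable (sentence STRING);

-- Load the two sample sentences (INSERT ... VALUES needs Hive 0.14+).
INSERT INTO TABLE texttable VALUES
  ('This is a Hadoop Post'),
  ('and Hadoop is a big data technology');

-- Steps 1-3 combined: split each sentence, explode into rows, then count per word.
SELECT word, count(1) AS cnt
FROM (
  SELECT explode(split(sentence, ' ')) AS word
  FROM texttable
) tempTable
GROUP BY word
ORDER BY word;
```

Note that GROUP BY compares strings case-sensitively, so "Hadoop" and "hadoop" would be counted as two different words. Wrapping the column in lower(sentence) before splitting is one way to merge them if that is what you want.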