OndaNet

My purpose is to learn data analysis and data mining using data from Onda Rock, an Italian music portal.

Step 1: A network of music

Each review page on Onda Rock is connected to the others through hyperlinks. I would like to build this network, using the following libraries to fetch and parse the pages:

  1. Requests
  2. BeautifulSoup
  3. html5lib
  4. NLTK
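
As a rough sketch of the link-extraction step, the snippet below collects the hyperlinks found in one page's HTML. It uses Python's built-in `html.parser` as a stand-in so that it runs without extra packages; BeautifulSoup with html5lib would do the same job with a more forgiving parser. The sample HTML and the `example.org` URLs are made up for illustration, and fetching the real page (for example with Requests) is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute targets of all <a href="..."> tags in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page's own URL.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return every hyperlink target in `html` as an absolute URL."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, not taken from the real site.
sample_html = '<p><a href="/reviews/b.htm">B</a> and <a href="c.htm">C</a></p>'
links = extract_links(sample_html, "https://example.org/reviews/a.htm")
```

Each page's list of links then becomes that page's outgoing edges in the network.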

To store and analyse the network:

  1. NetworkX
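
A minimal sketch of how the crawled links could be loaded into NetworkX, assuming the crawl produces a mapping from each page to the pages it links to. The `page_links` dictionary and its page names are hypothetical. A directed graph is used here because a hyperlink points from one page to another.

```python
import networkx as nx

# Hypothetical crawl result: page -> pages it links to.
page_links = {
    "a.htm": ["b.htm", "c.htm"],
    "b.htm": ["a.htm"],
    "c.htm": [],
}

graph = nx.DiGraph()
for page, targets in page_links.items():
    graph.add_node(page)  # keep pages that have no outgoing links
    for target in targets:
        graph.add_edge(page, target)

# In-degree counts how many other pages point at each page.
in_degrees = dict(graph.in_degree())
```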

To plot the data:

  1. D3
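
D3 examples for network plots commonly read a JSON object with a `nodes` list and a `links` list, so one possible bridge between the Python side and D3 is a small export step like the one below. This is only a sketch: the edge list is hypothetical, and the exact field names depend on the D3 code that will consume the file.

```python
import json

# Hypothetical edges: (source page, target page).
edges = [("a.htm", "b.htm"), ("a.htm", "c.htm"), ("b.htm", "a.htm")]


def to_node_link(edges):
    """Convert an edge list into a nodes/links dictionary."""
    nodes = sorted({page for edge in edges for page in edge})
    return {
        "nodes": [{"id": page} for page in nodes],
        "links": [{"source": s, "target": t} for s, t in edges],
    }


# JSON text that could be written to a file and loaded by the D3 page.
payload = json.dumps(to_node_link(edges), indent=2)
```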

Step 1.5: Find clusters

Are there clusters? And if so, do they follow the division by music genre?
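
One way to start on this question is a community-detection algorithm from NetworkX, followed by a check of which genres end up in each community. The sketch below uses `greedy_modularity_communities`, which is only one of several possible choices, and treats links as undirected for simplicity. The toy graph and the genre labels are invented for illustration.

```python
from collections import Counter

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Toy graph: two tightly linked groups joined by a single link.
graph = nx.Graph()
graph.add_edges_from([
    ("rock1", "rock2"), ("rock2", "rock3"), ("rock1", "rock3"),
    ("jazz1", "jazz2"), ("jazz2", "jazz3"), ("jazz1", "jazz3"),
    ("rock3", "jazz1"),
])

# Hypothetical genre labels, derived here from the made-up node names.
genre = {name: name.rstrip("0123456789") for name in graph.nodes}

communities = [set(c) for c in greedy_modularity_communities(graph)]

# For each community, count how many pages of each genre it contains.
genre_mix = [Counter(genre[page] for page in c) for c in communities]
```

If the clusters follow the genres, each entry of `genre_mix` is dominated by a single genre.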

Step 2: It's better to store the data

I will choose a way to store and organize the data, for example a database such as MongoDB or CouchDB. Every piece of information is valuable, such as the votes or the reviewer of each page.
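
A document store keeps each record as a JSON-like object, so a review page could be stored roughly as sketched below. The field names, the helper `make_review_document`, and the sample values are all hypothetical; only `_id` is fixed, since both MongoDB and CouchDB use it as the document identifier. No database calls are made here.

```python
import json


def make_review_document(url, reviewer, vote, links):
    """Build one JSON-serialisable record for a review page."""
    return {
        "_id": url,                   # the page URL doubles as a unique key
        "reviewer": reviewer,
        "vote": vote,
        "links": sorted(set(links)),  # outgoing links, without duplicates
    }


# Placeholder values, not real data from the site.
doc = make_review_document(
    url="https://example.org/reviews/a.htm",
    reviewer="Reviewer Name",
    vote=7.5,
    links=[
        "https://example.org/reviews/b.htm",
        "https://example.org/reviews/b.htm",
    ],
)
serialized = json.dumps(doc)
```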

Step 3: Further data analysis

I would also use Pandas to load the data for analysis. After that, I can look for correlations in the data, or develop a method that suggests music I don't know yet but am likely to enjoy.
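
As a first idea of what the correlation step might look like, the sketch below loads a few records into a Pandas DataFrame and computes the correlation between a page's vote and the number of pages linking to it. The column names and all values are invented placeholders; real rows would come from the stored reviews.

```python
import pandas as pd

# Invented placeholder rows, not real data.
records = [
    {"page": "a.htm", "vote": 8.0, "in_links": 12},
    {"page": "b.htm", "vote": 6.5, "in_links": 3},
    {"page": "c.htm", "vote": 7.0, "in_links": 5},
    {"page": "d.htm", "vote": 9.0, "in_links": 20},
]

frame = pd.DataFrame.from_records(records).set_index("page")

# Pearson correlation between the two columns (the default method).
correlation = frame["vote"].corr(frame["in_links"])
```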