So, you’ve got a data set, you say? Some points, and you’re not sure what to do with them. Maybe you have a set of x and y coordinates?
In my case, I had one set of values, each representing the number of followers a DJ has on SoundCloud. I had another set of values representing each DJ’s cost per night.
I wanted to use this data to predict, given an artist’s SoundCloud follower count, how much they might cost! So which set is the x and which is the y? And how do I go about this?
A great Rubyist once said, “Now we’re in math world, so let’s go to the Google.”
I found these two sites, which together made it pretty easy to build my algorithm.
The data we have (follower counts) is x, and the data we are predicting (cost per night) is y.
On this website, all you do is plug in your x, y coordinates and it creates four lines of best fit.
http://www.had2know.com/academics/regression-calculator-statistics-best-fit.html
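If you’d rather see what a calculator like that is doing for the straight-line case, here is a minimal sketch of an ordinary least-squares fit in plain Ruby. The method name linear_fit and the sample numbers in the usage note are my own assumptions for illustration, not real DJ data, and the other curve types are not covered here.

```ruby
# Ordinary least-squares fit of a straight line y = slope * x + intercept.
# xs and ys are equal-length arrays of numbers (at least two distinct xs).
def linear_fit(xs, ys)
  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n

  # Sum of cross-deviations and sum of squared x-deviations.
  sxy = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  sxx = xs.sum { |x| (x - mean_x)**2 }

  slope = sxy / sxx
  intercept = mean_y - slope * mean_x
  [slope, intercept]
end
```

Calling linear_fit([1, 2, 3], [3, 5, 7]) gives a slope of 2.0 and an intercept of 1.0, since those made-up points sit exactly on y = 2x + 1.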
Then you can use this site to graph them, which isn’t technically necessary, but it helped me see which type of line seemed to make sense.
http://www.mathsisfun.com/data/grapher-equation.html
It’s pretty neat, and it helped me arrive at the following method, a.k.a. my predictive algorithm. (sdcl_followers is an attribute of a Dj.)
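Here is a minimal sketch of what a method like that could look like, assuming a straight-line fit was the one chosen. Only Dj and sdcl_followers come from this post. The method name predicted_cost and the slope and intercept parameters are hypothetical, and you would pass in whatever coefficients the regression calculator reports for your own data.

```ruby
# Hypothetical sketch of a Dj with a cost prediction method.
class Dj
  attr_reader :sdcl_followers

  def initialize(sdcl_followers)
    @sdcl_followers = sdcl_followers
  end

  # Predicts cost per night from follower count using y = slope * x + intercept.
  # slope and intercept are placeholders for the fitted coefficients.
  def predicted_cost(slope:, intercept:)
    slope * sdcl_followers + intercept
  end
end
```

If a logarithmic, exponential, or power curve looked better on the graph, the body of predicted_cost is the only part that would need to change.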
That all said, this algorithm is pretty shit. It turns out SoundCloud follower count isn’t the greatest predictor of cost per night. I’ll need to take some other factors into account.
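One way to put a number on how weak a fit is would be the coefficient of determination, R². The sketch below is my own addition and assumes a straight-line fit. The method name r_squared is hypothetical, and slope and intercept stand in for whatever coefficients you fitted.

```ruby
# Coefficient of determination for a straight-line fit y = slope * x + intercept.
# Returns 1.0 for a perfect fit; values near 0 mean x explains little of y.
# Assumes the ys are not all identical (otherwise the total variation is zero).
def r_squared(xs, ys, slope, intercept)
  mean_y = ys.sum.to_f / ys.size

  # Variation left over after the fit vs. total variation around the mean.
  ss_res = xs.zip(ys).sum { |x, y| (y - (slope * x + intercept))**2 }
  ss_tot = ys.sum { |y| (y - mean_y)**2 }

  1.0 - ss_res / ss_tot
end
```

A low value here would back up the hunch that follower count alone doesn’t explain cost per night very well.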
Hope this is helpful.