I can’t believe I get to use stats on the job. (Note to NinjaRat… This is what you use probability bullshit for! Businesses use it. The government uses it. The military uses it. Our modern world runs on (some of) this shit, …sort of…)
The problem is I have a lot of data on the performance of a website, and they just let me do whatever I want to try to figure it out, which is pretty cool.
After reading about R on Slashdot I decided that it sounded like just the thing I needed to save myself from certain doom. And even more joyous is the fact that there is this kickass Python binding for it, called rpy2. It kicks ass. I can do all kinds of tests in less than 100 lines of code (I can probably compress even that by leveraging Python’s functional features more than I’m doing now). The Slashdot part is not exactly correct, since I first evaluated R a long, long time ago during some school bullshit. Hahahahaha.
As of now I’m thinking about doing some PCA to look at some of the observations and rank the “relevance” of the other factors that affect them. For my situation, I know for a fact that the observations depend on these factors!
At least that’s what I’m assuming. Need to think more about this though.
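Here is a rough sketch of what the PCA idea could look like in plain Python, just to pin the thought down. Everything in it is an assumption: the factor names and the numbers are made up, the helper functions are hypothetical, and it only finds the first principal component by power iteration instead of calling R through rpy2. Also worth keeping in mind: PCA ranks factors by how much they contribute to the main direction of variance among the factors themselves, which is not the same thing as how strongly they drive the observations.

```python
import math

def center_columns(rows):
    """Subtract each column's mean so the data is centered at zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance matrix (divides by n - 1) of centered rows."""
    n = len(centered)
    d = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    """Power iteration: returns (unit eigenvector, eigenvalue) for the
    largest eigenvalue. Assumes the start vector is not orthogonal to it
    and that the covariance matrix is not all zeros."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(a * b for a, b in zip(v, cov_v))
    return v, eigenvalue

# Made-up numbers, purely for illustration (one row per measurement).
names = ["requests_per_sec", "payload_kb", "cache_hit_pct"]
rows = [
    [120.0, 35.0, 92.0],
    [150.0, 40.0, 90.0],
    [200.0, 38.0, 85.0],
    [260.0, 52.0, 80.0],
    [310.0, 55.0, 74.0],
]

cov = covariance_matrix(center_columns(rows))
loadings, eigenvalue = first_component(cov)

# Share of total variance captured by the first component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

# Rank factors by the absolute size of their loading on that component.
ranking = sorted(zip(names, loadings), key=lambda p: abs(p[1]), reverse=True)
for name, loading in ranking:
    print(f"{name}: {loading:+.3f}")
print(f"explained variance share: {explained:.3f}")
```

Since this runs on the raw covariance matrix, whichever factor has the biggest numbers will dominate. Scaling each column to unit variance first (or letting R do it, e.g. with `prcomp(..., scale. = TRUE)`) is probably the saner choice for factors in different units.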
I also want to “try” statistical process control. Don’t know exactly how that would work yet, tho.
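One way it might work, as a minimal sketch: take a stretch of "known good" response times as a baseline, compute a center line and control limits at plus or minus three standard deviations, then flag any new measurement that falls outside them. The function names and the response times below are made up for illustration, and using the plain sample standard deviation is a simplification (a textbook individuals chart usually estimates sigma from the moving range instead).

```python
from statistics import mean, stdev

def control_limits(baseline, n_sigma=3.0):
    """Return (lower, center, upper) control limits from baseline data."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - n_sigma * spread, center, center + n_sigma * spread

def out_of_control(samples, lower, upper):
    """Return (index, value) pairs for samples outside the limits."""
    return [(i, x) for i, x in enumerate(samples) if x < lower or x > upper]

# Made-up response times in milliseconds, purely for illustration.
baseline_ms = [120, 118, 125, 122, 119, 121, 124, 120, 123, 118]
new_ms = [121, 119, 160, 122, 95]

lower, center, upper = control_limits(baseline_ms)
flagged = out_of_control(new_ms, lower, upper)

print(f"center={center}, limits=({lower:.2f}, {upper:.2f})")
print("flagged:", flagged)
```

With these numbers the 160 ms and 95 ms readings land outside the limits and get flagged. Real SPC also looks at runs and trends inside the limits, which this sketch ignores.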