Everything posted by Jurgis

  1. Maybe not a bad thread to post this: https://www.wired.com/story/notpetya-cyberattack-ukraine-russia-code-crashed-the-world/
  2. > There are mandatory vehicle inspections in some states in the US as well. NY and MA have mandatory vehicle inspections, and in fact cars tend to be in better shape than in CA, for example, where there are none. I am fairly certain that the average car age is higher where vehicle inspections are absent.

     What? Of course there are mandatory car inspections in CA for emissions ( https://www.dmv.org/ca-california/smog-check.php ). Maybe you mean mechanical/safety inspections? But mechanical inspections in MA are not as strict as in the EU and they don't check much. OTOH, emission inspections are presumably stricter in CA (not sure about MA) than in (most of?) the EU.
  3. GDPR

    I'm in the EU right now and it's rather funny how different websites try to be GDPR compliant. I am not a lawyer, but I'd guess more than 80% of those that try are not really compliant (hint: AFAIK, having an "Accept our cookies or else" banner is not GDPR compliant...). OTOH, there are websites that are likely compliant, and you don't see the behavior they present to EU users when you access their sites from the US. And then there are sites that clearly don't give a *&%^ about anybody accessing them from the EU. Likely no enforcement will be coming... unless the website is big and gets sued or whatever.
  4. Some fun and controversial thoughts, quotes and book references: https://www.newyorker.com/magazine/2018/07/23/can-economists-and-humanists-ever-be-friends
  5. This re-raises the question I had above in the thread. Why not pre-plan so as not to bump into the 5%/10% limits and be forced to sell? It seems like these guys need some investment-planning people or something... (And as above: yeah, in a bull market you may leave money on the table by taking a position that does not bump into the 5/10 limits. But you won't always have gains when the selling is forced.)
  6. You should just take it for a test drive. 8)
  7. I wonder what you guys think about BRK taking a (very close to) 5% position in companies and then having to sell down to 5% due to share buybacks. Isn't this very short-term investing? I.e., they pretty much know the company will be buying back shares, and they know that they will have to sell down within a quarter, so their holding time is a quarter or two for that slice of the position. Is it a good investment? (Yeah, if there's a bull market in that stock it is, but Buffett would not buy expecting a single-quarter bull market, would he?) Why not buy 4.8% or some other number where they would not be forced to sell within a short time frame? Not that this matters a lot, but I just wonder... (A small worked example of the mechanics is sketched below.)
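     To make the mechanics concrete, here is a minimal sketch with made-up share counts (the numbers are hypothetical, not BRK's actual holdings) showing how a buyback pushes a passive holder's stake over the 5% threshold:

     ```python
     # Hypothetical numbers for illustration only.
     shares_held = 490_000_000             # a stake just under 5%
     shares_outstanding = 10_000_000_000

     print(shares_held / shares_outstanding)    # 0.049 -> 4.9%

     # The company buys back and retires 5% of its shares; the holder does nothing.
     shares_outstanding = int(shares_outstanding * 0.95)

     print(shares_held / shares_outstanding)    # ~0.0516 -> now above 5%

     # To get back to exactly 5%, the holder is forced to sell the excess:
     excess = shares_held - 0.05 * shares_outstanding
     print(excess)                              # 15,000,000 shares to sell
     ```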
  8. Now I know why people don't like my couture. 8)
  9. Another article about biases and de-biasing: https://www.theatlantic.com/magazine/archive/2018/09/cognitive-bias/565775/ Kahneman again. Tetlock again. But also Nisbett (who I haven't seen mentioned before) and some good links and pointers.
  10. Looking at this, it might be that most (all?) brokers will join the club of not allowing purchases of "penny" stocks: I'm pretty sure no broker wants to go through the hassle of (1) for each transaction. They may not want to do the other points either. Of course, YMMV; some brokers may continue to offer the purchases without enforcing (1), etc.
  11. Agreed that a foreign license plate in Austin, TX is quite unlikely. Overall, foreign license plates are very unlikely in the US (compared to Europe, for example). Maybe a few Canadian ones closer to the Canadian border. It's possible, though, that this was a real incident. Just likely not a trafficking kidnapping, but drunken guys trying to pick up (grab) a girl for (likely) rape. This is possible IMO given the time and description. And then the details got embellished in the story. Edit: Like writser jokes, there are definitely places/countries where this could happen. So try to be safe. Though I doubt this forum reaches a lot of women who are at risk for such crime...
  12. > Putin is basically trying to go after those who put sanctions on his cronies, and he knows that even if he doesn't get them, he's creating a huge chilling effect: by making it known that he's going to try to reach officials even outside of Russia (as he has done repeatedly to Browder), he scares others who might think about going after the Russian Mafia.

      The translation is rather crappy (I guess Google Translate?), but hopefully the meaning is somewhat understandable. Like Liberty says, the Russians want to interrogate a number of Americans who they claim are connected to the "Bill Browder case".
  13. I have to compliment you: you do hit the right spots. 8) I also thought that their data cleaning/preparation was suspect. So I ran on data without their "missing dates" transformation. I thought I'd get bad results with that and that the results would then jump when I applied their transformation, which would prove that their results are caused by bad data preparation. No cigar. I get the same bad results with the original data and with data prepared based on their description. Which does not prove that their preparation wasn't broken... it might be broken in a way that's not described in the paper. Or maybe the description does not match what's in the code. (One common form such a "missing dates" fill might take is sketched at the end of this post.)

      Anyway, maybe this is enough time spent on this paper. Maybe the right thing to do is to wait and see if they are going to publish the code (or to ask for the code). Or just conclude that their results are broken and we just don't know why. 8) I'd still be interested to discuss my implementation with someone and where it might differ from theirs. Just to see if I missed something obvious or did something wrong. But my code is a hacked-up mess, so it's probably not a high-ROI use of anyone's time to look at it. 8) I could put my code on GitHub... oh noes, I'm too ashamed of the quality to do it... ::) Anyway, thanks for the discussion so far. 8)

      Ah, regarding:

      > I write this off as academia.

      People may be more interested in results/papers/theses (I think this was a master's thesis for one author) than in applying them in real life. Almost none of the people I know turned their thesis/papers into actual startup work. Maybe it's more common nowadays and in certain areas, but it's likely not very prevalent. I guess this area/paper would be easier to turn into money-making than other theses, but they still might not be interested. A valid question though. I'm not cynical enough to suggest that they know their results are broken and that's why they published instead of using them themselves. I somewhat believe people don't consciously publish incorrect results. But who knows. ::)
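      Since the paper's exact "missing dates" transformation isn't spelled out here, the sketch below shows one common way such a step is done (reindex onto a full calendar and forward-fill); the file and column names are hypothetical. This is an assumption about what the authors might have meant, not their code:

      ```python
      import pandas as pd

      # Load daily DJIA values indexed by date (hypothetical file/columns).
      df = pd.read_csv("djia_daily.csv", parse_dates=["Date"], index_col="Date")

      # Reindex onto a full daily calendar and forward-fill, so that
      # weekends/holidays carry the last known trading value.
      full_range = pd.date_range(df.index.min(), df.index.max(), freq="D")
      df_filled = df.reindex(full_range).ffill()
      ```

      Depending on how the training windows are built afterwards, a fill like this changes the statistics of the series, which is one way a preparation step could quietly inflate (or deflate) accuracy.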
  14. > I don't know what the authors did, but I'll reiterate from before: vanilla LSTMs do little better than guess on the stock market. They probably had like 1000 GPUs and tested thousands of hyperparameter configurations and "overfit" the test set. This is why papers like this are typically not believed anymore in the ML literature. Try adding some stuff like attention or skip connections and whatever else is hot now (I'm not sure); and didn't someone recommend GRUs instead? I have some other ideas you can use, like Gaussian processes to estimate real-time covariance matrices, but you're better off looking at the literature first than trying out harebrained ideas that might not work. It's really not a trivial exercise to outperform the market with ML.

      Ah, I think I see where there is a miscommunication between us. :) My goal is not to outperform the market with ML. My goal is to understand whether what is proposed in this paper works, and if it does not, then why. 8) You are possibly completely right that what the authors propose does not work. I just want to understand how they got the results they got. You've said "probably had like 1000 GPU and tested thousands of hyperparameter configurations and 'overfit' the test set" before. I don't think that's the case at all. If you read the paper (which you haven't so far), you can see that their training is really simple and there are no "thousands of hyperparameter configurations". Which is baffling in itself. I have some suspicions of what could be wrong, but it's not productive to discuss them if you just dismiss the paper offhand. Which is BTW your prerogative: if that's where you stand, that's fine and I won't bother you with this further. 8)

      > You are entirely correct that I haven't read the paper, and maybe I was too hasty in dismissing it. I wouldn't mind a copy of the paper if you don't mind sending me one. That being said, here is my reasoning in more depth. The authors seem like they are in ML academia, so I made a couple of assumptions. 1) It didn't look like their paper made it to one of the premier conferences. Maybe that's because they aren't big names, but more likely it's because people have been training LSTMs on stocks for a long time, vanilla LSTMs don't work well, and everyone in the ML community is, for good reason, suspicious of 80% hit rates from a vanilla LSTM on indices; they likely didn't do anything special, so you should assume they just got "lucky" with their model. The reason they got "lucky" is number 2): papers typically don't discuss the hyperparameter search they went through to find the exact right configuration, so even if they didn't say they tested 100s/1000s of hyperparameters, they might have, and likely did (although, yes, I didn't read the paper). Unless they specifically say there were few or no hyperparameters to test, or that they tested only a few, you should assume they tested many. This is a dirty secret in ML: you come up with a new technique and you don't stop testing hyperparameter choices for the model until you get good results on both the test set and the validation set. Then you submit to a journal saying the method did really well because it outperformed on both the validation set and the test set. But you stopped right after you got a hyperparameter choice that met those criteria, which strongly biases your results upward. This is related to p-hacking. It is a perfectly natural, but bad, thing people do, and it usually means most papers have performance that can't be matched when trying to reproduce them.
      >
      > You can pick basically any method of the thousands that have been proposed, and if it doesn't have over 1000 citations (and the method actually seems useful), this is probably one reason why. Now, maybe you are right and something else may be missing, but if I had to guess, I think there's a good chance the authors just got "lucky". BTW, why don't you ask the authors for their code? It's customary to either give this stuff out or post it on GitHub.
      >
      > As a side note: even a vanilla LSTM has many hyperparameters: number of states, activation type, number of variables to predict, test/train/validation breakdown, number of epochs, choice of k in k-fold validation, batch size, random seed, how the weights were initialized (Glorot, random normal, variance scaling, ...) for each weight in the ANN, the use of PCA or other methods to whiten the data, the momentum hyperparameter for hill climbing, learning rate initialization, choice of optimizer... My point is that even with a vanilla LSTM the author can pull more levers than can be hoped to be reproduced if you don't know absolutely everything, maybe even down to the version of Python installed, to reproduce the pseudorandom number generator. No doubt some of these choices will be mentioned in the paper, but many typically won't be, which makes any reproduction difficult. And typically the authors are the only ones who are incentivized to keep trying hyperparameter configurations until one works. The papers that are successful are typically methods where either it's not impossible to get a reproducible and externally valid hyperparameter configuration, or the method is relatively robust to hyperparameter choices.

      I sent you the link to the paper. If you look at Table 1, there are a couple of things to notice. Yeah, for Adagrad the accuracies are all over the place. But for Momentum and RMSprop they are all quite similar and way higher than 50% (which would be a random guess). So I think this somewhat shows that they did not just pick a single lucky combination of what you call hyperparams. You can still argue that perhaps there's a lucky hyperparam that is not shown in Table 1. That's possible, but I guess it's becoming less convincing. ;) OTOH, I did not run all the combinations they presented in Table 1, but from what I ran, the results were way more stable and clustered in the 48-52% range. So I wonder why they are getting much wider dispersion than I do and why their results have so much better accuracy. So I wonder if their results are correct. In other words, you question their results because you think they hyperparam-hacked. I question their results because I think there's another issue somewhere. But I don't know what it is. I think you're a bit overstating the instability of the runs. Yeah, there's definitely hyperparam hacking, but IMO (and I'm not a huge expert) the big differences come from network-architecture hacking rather than the version of Python, random seed, etc. Also, I think you're mostly talking about papers/work where someone tries to squeeze out a couple % gain on a widely studied problem where tons of methods have been applied in the past. I'd be more inclined to agree with you if these guys were at 53% accuracy in a single test or a couple of tests. But with that number of results in the 70% range, I think there's something else going on. But since I don't know what it is, your argument might still be weightier than mine. ;) (A minimal sketch of how many levers even a "vanilla" LSTM exposes follows below.)
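      To make the "many levers" point concrete, here is a minimal tf.keras sketch of a "vanilla" LSTM direction classifier; every numeric choice in it is a hyperparameter of the kind enumerated above. The values are arbitrary examples, not the paper's:

      ```python
      import tensorflow as tf

      window, n_features = 10, 1      # lookback length, inputs per day

      model = tf.keras.Sequential([
          tf.keras.layers.LSTM(32, activation="tanh",        # state size, activation
                               input_shape=(window, n_features)),
          tf.keras.layers.Dense(1, activation="sigmoid"),    # P(close > open)
      ])

      model.compile(
          optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-3),  # optimizer, LR
          loss="binary_crossentropy",
          metrics=["accuracy"],
      )

      # model.fit(x_train, y_train, epochs=50, batch_size=32,  # epochs, batch size
      #           validation_split=0.2)                        # train/val breakdown
      ```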
  15. > I don't know what the authors did, but I'll reiterate from before: vanilla LSTMs do little better than guess on the stock market. They probably had like 1000 GPUs and tested thousands of hyperparameter configurations and "overfit" the test set. This is why papers like this are typically not believed anymore in the ML literature. Try adding some stuff like attention or skip connections and whatever else is hot now (I'm not sure); and didn't someone recommend GRUs instead? I have some other ideas you can use, like Gaussian processes to estimate real-time covariance matrices, but you're better off looking at the literature first than trying out harebrained ideas that might not work. It's really not a trivial exercise to outperform the market with ML.

      Ah, I think I see where there is a miscommunication between us. :) My goal is not to outperform the market with ML. My goal is to understand whether what is proposed in this paper works, and if it does not, then why. 8) You are possibly completely right that what the authors propose does not work. I just want to understand how they got the results they got. You've said "probably had like 1000 GPU and tested thousands of hyperparameter configurations and 'overfit' the test set" before. I don't think that's the case at all. If you read the paper (which you haven't so far), you can see that their training is really simple and there are no "thousands of hyperparameter configurations". Which is baffling in itself. I have some suspicions of what could be wrong, but it's not productive to discuss them if you just dismiss the paper offhand. Which is BTW your prerogative: if that's where you stand, that's fine and I won't bother you with this further. 8)
  16. The paper I was talking about is "Dow Jones Trading with Deep Learning: The Unreasonable Effectiveness of Recurrent Neural Networks", to be presented at http://insticc.org/node/TechnicalProgram/data/presentationDetails/69221 . The paper is not publicly available, but you can ask the authors for a copy. I have a copy and can send it to people interested, but I won't post it here publicly. PM me if you want a copy.

      A couple of comments on various things previously mentioned, now that the paper is semi-public:

      - The paper predicts the daily close of the DJIA from the daily open plus the opens of the previous n days (n = 2-10).
      - The trading algorithm is simply: buy if predicted close > open, and sell otherwise. If you cannot buy (you already have a position), hold. If you cannot sell (you already hold cash), hold cash. (A minimal sketch of this rule is below.)
      - The authors use training data from 01/01/2000 to 06/30/2009 and test data from 07/01/2009 to 12/31/2017. This somewhat answers the critique that the training is from a bull market: it's not. The testing is not completely from a bull market either.
      - The authors use a pretty much vanilla LSTM, so IMO the critique of "1000s of academics looking for signals and the winners publish a paper", or that they tweaked/over-fitted the model until it worked, does not seem to apply. (It's possible that they messed up somehow and used testing data in training, but they seem to be careful, so it doesn't seem very likely.) This is really vanilla IMO, without much tweaking at all. Which makes the results surprising, BTW.
      - I have some other comments, but I'd rather discuss this further with people who have read the paper, so I won't post them now. 8)

      As I mentioned, I spent a bunch of time reimplementing what these guys presumably implemented. I do not get their results. My results are pretty much at the level of random guessing, i.e. the accuracy is around 48-52%, while they get up to 80% accuracy. It's quite possible I am not doing something the same way they did. It's possible that their implementation or testing is messed up somehow too. But that's hard to prove. Maybe they'll open-source their implementation sometime in the future. 8)

      If anyone is interested in getting together (online through some tools?) to go through the paper and/or my implementation, we can do it. PM me and we'll try to figure out what would work best. 8)
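      As referenced in the list above, here is a minimal long/flat sketch of the trading rule as described. It assumes trades execute at the open; the function and inputs are hypothetical illustrations, not the authors' code:

      ```python
      def backtest(opens, closes, predicted_closes):
          """Long/flat backtest of the rule described above: buy at the open
          when the predicted close exceeds the open, otherwise move to cash.
          A hypothetical sketch, not the authors' implementation."""
          cash, shares = 1.0, 0.0
          for open_, pred in zip(opens, predicted_closes):
              if pred > open_ and shares == 0.0:     # buy signal, currently in cash
                  shares, cash = cash / open_, 0.0
              elif pred <= open_ and shares > 0.0:   # sell signal, currently long
                  cash, shares = shares * open_, 0.0
          return cash + shares * closes[-1]          # liquidate at the last close
      ```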
  17. IMO it really depends on the person and the circumstances. There are people who do just fine by starting their own shop without working elsewhere first. There are people who do fine starting their shop after working in corporate, not-legendary offices. There are people who do great after working for legends. There are also people who don't do well in exactly the same scenarios.

      I don't want to psychoanalyze (so please forgive me), but it looks like you're looking for others to say "yeah, you have to work for a legend first". Sure, working for a legend first would open a lot of doors. But you have to realize that, percentage-wise, the number of people who get to work for legends is very low. So it's up to you to try to get a position with a legendary investor, but likely you won't. There's only one Tracy Britt Cool in the last 30+ years (Todd and Ted don't count: they went to work for Buffett after being successful investors, not before). So IMO you're severely handicapping yourself if you think that working for a legend is the only way to get a good career/results/etc.

      Edit: regarding spotting a future Klarman to work for: this is as hard as spotting a future Klarman to invest with. I think people have already suggested that you go to BRK/Fairfax/DJCO/BOMN/SYTE/whatever annual investor meetings and network with the CoBF/etc./investor/hedge crowd there. This might give you opportunities. You might even corner Whitney Tilson or Mohnish Pabrai at one of these. Though I still think that discovering a future Klarman... and clicking with them enough so that they'll hire you... is not going to be easy. Good luck.
  18. @racemize: Do you know/have you met Brian? Any stories/anecdotes?
  19. In Soviet Russia investing legends notice you.
  20. I wonder if the buyback announcement is related to the Gates Foundation. Perhaps BRK wants to buy back blocks that the Gates Foundation plans to sell.
  21. LOL, too late. WMT is already selling third-party electronics (at least; maybe other things too) that are as shady as the Amazon marketplace.
  22. https://www.theguardian.com/global/2018/jul/08/generation-wealth-how-the-modern-world-fell-in-love-with-money My bling is blinger than yours! ;D