Liberty
Posted May 5, 2020

https://openai.com/blog/ai-and-efficiency/

I enjoyed this post, which attempts to measure algorithmic efficiency improvements in ML training. Very impressive.
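For a rough sense of the headline number: the post reports that reaching AlexNet-level accuracy in 2019 takes 44x less compute than in 2012, which OpenAI translates into algorithmic efficiency doubling roughly every 16 months. A quick Python check of that arithmetic (the 44x and 7-year figures are from the post; the script itself is just illustrative):

```python
import math

# Figures reported in the OpenAI post: 44x less compute to reach
# AlexNet-level accuracy, measured over the 7 years from 2012 to 2019.
improvement = 44
years = 7

# Implied doubling time of algorithmic efficiency, in months.
doubling_time_months = years * 12 / math.log2(improvement)
print(f"{doubling_time_months:.1f} months per doubling")  # ~15.4, i.e. roughly 16 months
```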
lnofeisone
Posted May 5, 2020

Thanks for sharing. This is not an easy problem, and they definitely have an interesting take on things (one I'd call very directionally correct ;D). A few things I'd look for them to resolve/discuss:

1) AlexNet and VGG are serial networks, and comparing them to ResNet (which is a network-in-network architecture) is, in my view, like comparing vacuum tubes to transistors (see the sketch at the end of this post).

2) EfficientNets are super powerful but rely on a baseline network (really excelling with transfer learning), which means they are specialized and can be challenging to apply broadly.
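To make point 1 concrete, here's a minimal PyTorch sketch of the difference — illustrative toy blocks, not the actual AlexNet/VGG/ResNet definitions. In a serial block the signal has exactly one path; in a residual block the input skips around the convolutions, so the block only has to learn a correction to the identity:

```python
import torch
import torch.nn as nn

class SerialBlock(nn.Module):
    """VGG-style: one path, convolutions stacked in series."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv2(self.relu(self.conv1(x))))

class ResidualBlock(nn.Module):
    """ResNet-style: the input is added back after the convolutions."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

x = torch.randn(1, 64, 32, 32)
print(SerialBlock(64)(x).shape, ResidualBlock(64)(x).shape)
```

That skip connection is what lets residual networks train at depths where plain serial stacks degrade, which is why I see the jump as a discontinuity rather than an incremental tweak.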
Liberty (Author)
Posted May 5, 2020

lnofeisone wrote:
> AlexNet and VGG are serial networks and comparing them to ResNet (which is a network-in-network architecture) is, in my view, like comparing vacuum tubes to transistors.

Don't we compare vacuum tubes to transistors when we look at progress in computing over time?
lnofeisone
Posted May 5, 2020

Liberty wrote:
> Don't we compare vacuum tubes to transistors when we look at progress in computing over time?

We certainly put them on a timeline to see computations per dollar spent, but I've never seen them compared directly (maybe if I searched EE papers from circa the '20s-'50s). It's hard to compare the two due to the technological discontinuity; that's the primary reason Moore's law (a rule of thumb) starts with the invention of the transistor. I'd argue that going from AlexNet to ResNet-type architectures is a similar discontinuity.
Liberty (Author)
Posted May 5, 2020

lnofeisone wrote:
> It's hard to compare the two due to the technological discontinuity; that's the primary reason Moore's law (a rule of thumb) starts with the invention of the transistor.

You can certainly compare them in computations per second or price per operation or whatever. Of course they're ancient, so less relevant now, but I don't see why they can't be compared. A rough sketch of the idea is below.
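For example, you could put both eras on a single computations-per-dollar trend and read off a per-era growth rate, discontinuity or not. A crude sketch in Python — the data points here are made up purely for illustration, not real measurements:

```python
import math

def doubling_time_years(year0, perf0, year1, perf1):
    """Doubling time implied by two (year, performance-per-dollar) points
    on an exponential trend."""
    return (year1 - year0) / math.log2(perf1 / perf0)

# Hypothetical data points for illustration only.
tube_era = doubling_time_years(1945, 1.0, 1955, 30.0)
transistor_era = doubling_time_years(1965, 1e3, 1975, 1e5)
print(f"tube era: {tube_era:.1f} years/doubling, "
      f"transistor era: {transistor_era:.1f} years/doubling")
```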