Ask HN: Recommendation for a SWE looking to get up to speed with latest on AI
99 points by Rizu 2 hours ago | 38 comments





The poster's looking for articles, so this recommendation's a bit off the mark: I learned more from participating in a few Kaggle competitions (https://www.kaggle.com/competitions) than I did from reading about AI. Many folks in the community shared their homework, and by learning to follow their explanations I developed a much more intuitive understanding of the technology. The first competition had a steep learning curve, but I felt it was worth it. Having a specific goal and a provided dataset made the problem space much more tractable.

Out of sheer curiosity, how much time did you spend on it on average? How much of this knowledge are you using now?

Not the poster you responded to, but I learned quite a bit from Kaggle too.

I started from scratch, spent 2-4 hrs per day for 6 months, and won a silver in a Kaggle NLP competition. I use some of it now, but not all of it. More than that, I'm quite comfortable with models and understand the costs/benefits/implications etc. I started with Andrew Ng's intro courses, did a bit of fastai, did Karpathy's Zero to Hero fully, all of Kaggle's courses, and a few other such things. Kagglers share excellent notebooks and I found them very helpful. Overall I highly recommend this route of learning.


I was also playing on Kaggle a few years back; similar feedback.

Thanks for the detailed reply!

A new short course on the FreeCodeCamp YouTube channel looks good:

Ollama Course – Build AI Apps Locally https://youtu.be/GWB9ApTPTv4?feature=shared

As an aside, does anyone have any ideas about this: there should be an app like an 'auto-RAG' that scrapes RSS feeds and URLs, in addition to ingesting docs, text and content in the normal RAG way. Then you could build AI chat-enabled knowledge resources around specific subjects. Autogenerated summaries and dashboards would provide useful overviews.

Perhaps this already exists?
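
For what it's worth, the ingestion/retrieval core isn't hard to prototype. Here's a rough sketch of what I mean, assuming the feedparser and sentence-transformers packages; the feed URL, function names, and retrieval logic are just placeholders:

    # Minimal sketch of the "auto-RAG" idea: pull an RSS feed, embed each entry,
    # and answer questions by retrieving the most similar entries.
    import feedparser
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")

    def ingest_feed(url):
        """Fetch an RSS feed and return (text, embedding) pairs for each entry."""
        feed = feedparser.parse(url)
        texts = [f"{e.title}\n{e.get('summary', '')}" for e in feed.entries]
        embeddings = model.encode(texts, normalize_embeddings=True)
        return list(zip(texts, embeddings))

    def retrieve(question, store, k=3):
        """Return the k entries most similar to the question (cosine similarity)."""
        q = model.encode([question], normalize_embeddings=True)[0]
        ranked = sorted(store, key=lambda item: -float(np.dot(item[1], q)))
        return [text for text, _ in ranked[:k]]

    store = ingest_feed("https://example.com/feed.xml")  # placeholder feed URL
    context = retrieve("What happened with open-weight models this week?", store)
    # `context` then gets stuffed into an LLM prompt for the chat layer.

The autogenerated summaries and dashboards would sit on top of that skeleton.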


I don't think it's a good idea to keep up to date at a daily/weekly cadence unless you somehow directly get paid for it. It's like checking stocks daily: it doesn't lead to good investment decisions.

It's better to do it in batches, like once every 6-12 months or so.


How do you do that? Once you're out of the loop for half a year, it becomes harder to know what's important and what's not, I think.

Everything looks novel on release. Once something has been around for a while and is still being referenced, you know it's worth learning.

Waiting 3-6 months to take a deep dive is a good pattern to prevent investing your time in dead-end routes.




What a goldmine of recommendations. I like Sam Witteveen's YouTube stuff for keeping up to speed: https://m.youtube.com/@samwitteveenai

I recently wrote a post for a coworker who asked the exact same question.

https://dandavis.dev/llm-knowledge-dump.html


The best place for the latest information isn't tech blogs, in my opinion. It's the Stable Diffusion and LocalLLaMA subreddits. If you are looking to learn about everything on a fundamental level, you need to check out Andrej Karpathy on YouTube. There are some other notable mentions in other people's comments.

Lots of people can get impressive demos up and running, but if you want to run AI products in production, you're going to have to do system evals. System evals make sure your product is doing what it says on the box, even when the qualities you care about are hard to quantify.

We wrote a zine on system evals without jargon: https://forestfriends.tech

Eugene Yan has written extensively on it https://eugeneyan.com/writing/evals/

Hamel has as well. https://hamel.dev/blog/posts/evals/
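
To give a concrete flavour: a system eval can start as small as a handful of representative queries, each with a programmatic grader, run against your own pipeline. A minimal sketch, where run_pipeline and the cases are placeholders for whatever your product actually does:

    # A tiny eval harness: each case pairs a query with a grader that returns
    # True when the output is acceptable.
    def run_pipeline(query: str) -> str:
        # placeholder: your RAG chain / agent / prompt goes here
        return "stub output"

    EVAL_CASES = [
        ("Summarise our refund policy", lambda out: "30 days" in out),
        ("What is 2 + 2?", lambda out: "4" in out),
        ("Reply in French: hello", lambda out: "bonjour" in out.lower()),
    ]

    def run_evals():
        results = [(q, grader(run_pipeline(q))) for q, grader in EVAL_CASES]
        passed = sum(ok for _, ok in results)
        print(f"{passed}/{len(results)} cases passed")
        for q, ok in results:
            if not ok:
                print(f"FAIL: {q}")
        return results

    run_evals()

From there you grow the case set from real failures and add fuzzier graders (LLM-as-judge and the like) for the qualities a string check can't capture.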


Machine Learning Mastery (https://machinelearningmastery.com) provides code examples for many of the popular models. For me, seeing and writing code has been helpful in understanding how things work and makes it easier to put new developments in context.

Simon's blog is fragmented because it's, well, a blog. It would be hard to find a better source to "keep updated on things AI" though. He does do longer summary articles sometimes, but mostly he's keeping up with things in real time. The search and tagging systems on his blog work well, too. I suggest you stick his RSS feed in your feed reader, and follow along that way.

Swyx also has a lot of stuff keeping up to date at https://www.latent.space/, including the Latent Space podcast, although tbh I haven't listened to more than one or two episodes.


Thanks! I also have a daily news recap here: https://buttondown.email/ainews/archive/


daveshap quit AI, right? Got AGI-pilled.

He was only gone for a few days, IIRC. At any rate, he's back to publishing AI-related content, and it looks like all (?) of his old content is back on his YT channel.

As I was building up my understanding/intuition for the internals of transformers + attention, I found 3Blue1Brown's series of videos (specifically on attention) to be super helpful.
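
If it helps to see it in code as well, this is the scaled dot-product attention those videos walk through, sketched in plain numpy with made-up shapes:

    # Single-head scaled dot-product attention. x has shape (seq_len, d_model).
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention(x, W_q, W_k, W_v):
        Q, K, V = x @ W_q, x @ W_k, x @ W_v      # project tokens to queries/keys/values
        scores = Q @ K.T / np.sqrt(K.shape[-1])  # how much each token attends to each other token
        weights = softmax(scores, axis=-1)       # each row sums to 1
        return weights @ V                       # weighted mix of value vectors

    rng = np.random.default_rng(0)
    seq_len, d_model, d_head = 4, 8, 8
    x = rng.normal(size=(seq_len, d_model))
    W_q, W_k, W_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
    out = attention(x, W_q, W_k, W_v)  # shape: (seq_len, d_head)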

This has been good for me, though it is more foundational than the latest developments: https://www.mattprd.com/p/openai-cofounder-27-papers-read-kn...

The localllama subreddit, although focused mostly on open source locally run models, still has ample discussion of SOTA models too.

https://old.reddit.com/r/LocalLLaMA/



Unwind AI would be helpful. They publish daily newsletters on AI as well as tutorials on building apps with step-by-step walkthroughs. Super focused on developers. https://www.theunwindai.com/

Build a tool on top of the LLM layer for a specific use case. That'll get you up to speed. You haven't missed much.

Exactly. Avoid intentionally throwaway effort and instead attempt to build something specific and practical. Learn by doing.

Reproduce nanogpt.

Then find a small dataset and see if you can start getting close to some of the reported benchmark numbers with similar architectures.
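
If it helps to see where that route starts, this is roughly the character-level bigram baseline Karpathy builds before layering in attention blocks; a sketch assuming PyTorch and any plain-text corpus (e.g. tiny shakespeare) saved as input.txt:

    # Character-level bigram language model: each token directly predicts logits
    # for the next token. The starting point before adding transformer blocks.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    text = open("input.txt").read()
    chars = sorted(set(text))
    stoi = {c: i for i, c in enumerate(chars)}
    data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

    class BigramLM(nn.Module):
        def __init__(self, vocab_size):
            super().__init__()
            self.table = nn.Embedding(vocab_size, vocab_size)
        def forward(self, idx):
            return self.table(idx)  # logits over the next character

    def get_batch(block_size=8, batch_size=32):
        ix = torch.randint(len(data) - block_size - 1, (batch_size,))
        x = torch.stack([data[i:i + block_size] for i in ix])
        y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
        return x, y

    model = BigramLM(len(chars))
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
    for step in range(1000):
        x, y = get_batch()
        logits = model(x)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()

Swap the embedding table for transformer blocks and you're most of the way to nanogpt, at which point the benchmark comparisons start to make sense.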


Being a coder, I find these resources extremely useful:

GitHub blog: https://github.blog/ai-and-ml/
Cursor blog: https://www.cursor.com/blog


My blog is very high volume so yeah, it can be difficult to know where to look on it.

I use tags a lot - these ones might be more useful for you:

https://simonwillison.net/tags/prompt-engineering/ - collects notes on prompting techniques

https://simonwillison.net/tags/llms/ - everything relating to LLMs

https://simonwillison.net/tags/openai/ and https://simonwillison.net/tags/anthropic/ and https://simonwillison.net/tags/gemini/ and https://simonwillison.net/tags/llama/ and https://simonwillison.net/tags/mistral/ - I have tags for each of the major model families and vendors

Every six months or so I write something (often derived from a conference talk) that's more of a "catch up with the latest developments" post - a few of those:

- Stuff we figured out about AI in 2023 - https://simonwillison.net/2023/Dec/31/ai-in-2023/ - I will probably do one of those for 2024 next month

- Imitation Intelligence, my keynote for PyCon US 2024 - https://simonwillison.net/2024/Jul/14/pycon/ from July this year


Subscribe to The Neuron newsletter


Get on Twitter (well, X), as that's where the cutting edge is.

Read through this, making flashcards as you go: https://eugeneyan.com/writing/llm-patterns/

Then spin up a RAG-enhanced chatbot using pgvector on your favourite subject, and keep improving it as you learn about new techniques.
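
For a rough idea of the pgvector part, something like this is enough to get going; a sketch assuming Postgres with the pgvector extension plus the psycopg, pgvector, and sentence-transformers packages (table and column names are just illustrative):

    # Store embedded chunks in Postgres and retrieve the nearest ones per question.
    import psycopg
    from pgvector.psycopg import register_vector
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim embeddings

    conn = psycopg.connect("dbname=rag", autocommit=True)
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
    register_vector(conn)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS docs (
            id bigserial PRIMARY KEY,
            content text,
            embedding vector(384)
        )
    """)

    def add_doc(text):
        conn.execute(
            "INSERT INTO docs (content, embedding) VALUES (%s, %s)",
            (text, model.encode(text)),
        )

    def retrieve(question, k=5):
        rows = conn.execute(
            "SELECT content FROM docs ORDER BY embedding <=> %s LIMIT %s",  # cosine distance
            (model.encode(question), k),
        ).fetchall()
        return [content for (content,) in rows]

The retrieved chunks then go into the prompt of whichever chat model you're using.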


Are you wanting to get into LLMs in particular, or something else? I am a software engineer also trying to make headway into so-called "AI", but I have little interest in LLMs. For one, the field is in a major hype bubble right now. The second reason is that, because of the first, it has a huge amount of attention from people who study and work on this every day; I don't have the time commitment to compete with that. Lastly, as mentioned, I have no interest in it, and my understanding of LLMs leads me to believe they have few interesting applications besides generating a huge amount of noise in society and dumping heat. The Internet (blogs, articles, even YouTube) is already being overrun by LLM-generated material that is effectively worthless. I'm not sure LLMs are a net positive.

For me personally, I prefer to work backwards and then forwards. What I mean by that is that I want to understand the basics and fundamentals first. So I'm slowly trying to bone up on my statistics, probability, and information theory, and I've targeted machine learning books that also take a fundamentals-first approach. There's no end to books in this realm for neural networks, machine learning, etc., so it's hard to recommend beyond what I've just picked, and I'm just getting started anyway.

If you can get your employer to pay for it, MIT xPRO has courses on machine learning (https://xpro.mit.edu/programs/program-v1:xPRO+MLx/ and https://xpro.mit.edu/courses/course-v1:xPRO+GenAI/). These will likely give a pretty up to date overview of the technologies.


Check out Ollama. It lets you run open models on your own hardware, and it also provides an easy-to-use REST API similar to OpenAI's.
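
Once a model is pulled, hitting the local API is a couple of lines; the model name here assumes you've already run something like ollama pull llama3:

    # Ollama listens on localhost:11434 by default.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3",
            "prompt": "Explain retrieval-augmented generation in two sentences.",
            "stream": False,  # return one JSON object instead of a token stream
        },
    )
    print(resp.json()["response"])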

Unpopular opinion: if you can't use Google or ChatGPT to get an answer to this question, I have bad news for you.


