This is that special blend of Tablet Kid “I don’t need to know things I can google them” and Rich Kid “I don’t need to do things I can crowdsource them” that makes for that Distinctively VP “I don’t know what I’m doing and nobody can tell 👈😎👉”
Is this fake?
For context, this is the guy who figured out how to see what’s written on some ancient Greek Scrolls without destroying them. It seems slightly far-fetched that he wouldn’t know better.
What Greek scrolls?
Ok so they were apparently in Greek but not from Greece. Source: https://news.unl.edu/article-2
Oh yeah the hosted DeepSeek has that
The secret to success in software engineering:
- Lie and say that there is
- Write or use a conversion algorithm
- Boss won’t know the difference
- Collect bonus at performance evaluation
- Put “AI engineer” on resume
Technically OCR is an application of machine learning.
Not an LLM, though.
A world of difference
regularizing the OCR’d form into a json/html file might be a good application of an LLM though. Perhaps this is what they were asking about.
I doubt they even know what they are asking about?
Like opening source code in Word.
I have to admit, PDF parsing being such a hot and profitable topic in computer science was really something I never saw coming.
PDFs? The things you can select text from? And when not, there’s decent OCR? And when not, you just ask the person to send you an email or a word doc?
It sounds like LLM makers are looking for a new unpolluted source of historical data to train on, and this source exists in the form of old scanned-in paper documents. That’s the only reason I can fathom as to why this is such a big thing now.
Training the most insane AI model on classified federal documents.
Selecting text doesn’t work in most multi-column PDFs, and good OCR costs money. And if the original source is lost and you want an exact copy in Word, the OCR tools need to be really good at guessing the whitespace-to-line ratio, because PDF is only an output format and not a processing format.
For most other converting needs, there’s pandoc, imagemagick and ffmpeg.
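To illustrate the multi-column point above, here is a rough sketch, not how any particular PDF viewer or OCR tool actually works. It assumes a hypothetical `Word` record with a position for each extracted word and a made-up two-column page with a known split at `split_x`. Reading strictly top-to-bottom interleaves the two columns, while grouping by column first keeps each one intact.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical extracted word; x is the left edge, y grows downward."""
    text: str
    x: float
    y: float


def naive_order(words):
    # Top-to-bottom, then left-to-right: fine for one column, wrong for two.
    return [w.text for w in sorted(words, key=lambda w: (w.y, w.x))]


def column_order(words, split_x):
    # Read the entire left column first, then the entire right column.
    def by_line(w):
        return (w.y, w.x)

    left = sorted((w for w in words if w.x < split_x), key=by_line)
    right = sorted((w for w in words if w.x >= split_x), key=by_line)
    return [w.text for w in left + right]


# Made-up page: left column near x=50..120, right column near x=350..400.
PAGE = [
    Word("Selecting", 50, 100), Word("text", 120, 100),
    Word("good", 350, 100), Word("OCR", 400, 100),
    Word("breaks", 50, 120), Word("here", 120, 120),
    Word("costs", 350, 120), Word("money", 400, 120),
]

print(" ".join(naive_order(PAGE)))
# Selecting text good OCR breaks here costs money
print(" ".join(column_order(PAGE, split_x=300)))
# Selecting text breaks here good OCR costs money
```

In real documents the hard part is finding the column boundaries at all, which is where the guessing about whitespace comes in. The fixed `split_x` here sidesteps that entirely.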
yes me send me what you want me to parse and i will get back to you in 3-4 business days
be sure to include the metadata too. lol
And processing fee!
the only fee i want is pics
Imagine getting a job like this and now half the nation knows your name… that’s terrifying. Being an intern may mean you have no idea of the true scope of what they are asking you to do.
Yeah, seems that’s the point. Old enough to competently perform what they’re told, but too young to realize the gravity of the situation and how wrong it is to partake in it.
That’s why we have 18-year-old soldiers…
It’s ok, with the experience gained from being forced to grow up, some will come home and use their savings to buy a Dodge Ram on a 7-year loan at 18% APR.