Running a Local LLM on AWS Without Losing Your Mind (Mostly)
If you’ve ever tried doing anything moderately unusual in AWS — like running a local LLM such as DeepSeek inside a secure EC2 instance — you probably know the pain I’m about to describe. What should have taken 30 minutes turned into a swirling hell of full disks, memory errors, buried menus, and UI madness.
AWS is a powerful platform, no question. But it feels like it was built by 300 different teams who never spoke to each other. The UI looks sleek on the surface, but every task takes twice as long, and every menu seems to require a treasure map.
Let me walk you through how I got DeepSeek-R1, a powerful local LLM, running inside an EC2 instance — without cloud calls, without GPT, without losing (too much of) my sanity.
Step 1: Spinning Up an EC2 Instance (and Already Regretting It)
I launched a basic EC2 instance. I figured, sure, start with something small — it’s just text inference, right? Wrong. The default root volume was tiny (8 GB on most Linux AMIs). The default memory was laughable. The UI to configure all of this? A maze of modals, tabs, and dropdowns. Changing anything feels like filling out a form in SAP SuccessFactors.
AWS makes you jump through hoops just to get to a basic overview of your instance. There’s no clear, centralized dashboard where you can view and adjust everything from one place. You’re constantly being kicked from “EC2” to “Volumes” to “Instances” to “Snapshots” and back again. I had to reorient myself every 5 clicks.
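For what it’s worth, the console maze can be skipped entirely with the AWS CLI. This is a rough sketch rather than my exact setup — the AMI ID, key pair, and security group below are placeholders you’d swap for your own:

```shell
# Sketch: launch an instance with a 32 GB root volume up front,
# so you never hit the default-size wall. All IDs are placeholders.
aws ec2 run-instances \
  --image-id ami-0abcdef1234567890 \
  --instance-type t3.xlarge \
  --key-name my-key \
  --security-group-ids sg-0abcdef1234567890 \
  --block-device-mappings 'DeviceName=/dev/xvda,Ebs={VolumeSize=32,VolumeType=gp3}' \
  --query 'Instances[0].InstanceId' --output text
```

Sizing the root volume at launch would have avoided the entire next step.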
Step 2: Running Out of Disk Space (Instantly)
As soon as I tried to pull the DeepSeek model using Ollama, the disk filled up and exploded in my face. AWS’s default root volumes are small, and DeepSeek-R1 is heavy: even a quantized 8B variant is roughly 5 GB, and the bigger variants need far more. I got hit with a brutal “no space left on device” error before the model had even finished downloading.
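In hindsight, a 30-second pre-flight check would have saved the whole episode. A minimal sketch — the 20 GB threshold is my own guess at what a quantized DeepSeek-R1 pull needs, not an official number:

```shell
# Pre-flight: make sure there is room before pulling a multi-gigabyte model.
# REQUIRED_GB is an assumed threshold -- adjust for the model variant you pull.
REQUIRED_GB=20
AVAIL_GB=$(df --output=avail -BG / | tail -n 1 | tr -dc '0-9')
if [ "$AVAIL_GB" -lt "$REQUIRED_GB" ]; then
  echo "Not enough disk: ${AVAIL_GB}G free, want ${REQUIRED_GB}G. Resize first."
else
  echo "OK: ${AVAIL_GB}G free."
fi
```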
Finding where to resize the volume was a challenge in itself. You don’t do this from the instance page — oh no. You have to click into the volume separately, open another panel, hit “Modify Volume,” enter a new size (I picked 32 GB), confirm three times, and hope you didn’t miss something.
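If you’d rather skip the panel-hopping, the same resize can be done in two CLI calls — the volume ID here is a placeholder:

```shell
# Grow the root EBS volume to 32 GB (volume ID is a placeholder).
aws ec2 modify-volume --volume-id vol-0abcdef1234567890 --size 32
# Watch the modification until it reports "optimizing" or "completed".
aws ec2 describe-volumes-modifications --volume-ids vol-0abcdef1234567890
```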
Once that’s done, you still need to SSH into the instance and manually extend the partition and file system. That means installing tools, running terminal commands, and praying nothing fails. In 2025. In a cloud platform. Seriously?
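For the record, the in-instance half looks roughly like this on Ubuntu with an ext4 root on /dev/xvda — device names and filesystems vary by AMI, so treat it as a sketch:

```shell
# Inside the instance, after the EBS resize has completed:
sudo apt-get install -y cloud-guest-utils   # provides growpart (Ubuntu/Debian)
sudo growpart /dev/xvda 1                   # grow partition 1 to fill the volume
sudo resize2fs /dev/xvda1                   # ext4; on XFS use: sudo xfs_growfs /
df -h /                                     # confirm the new size
```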
Step 3: RAM Isn’t RAM Until You Change the Whole Machine
After finally fixing the disk space and downloading the model, I tried running it — only to hit the next wall: “not enough memory.”
Of course.
Unlike other platforms, AWS doesn’t let you just “add more RAM” to a machine. No sliders, no real-time scaling. Instead, you have to stop the instance, pick a new instance type (like t3.large for 8 GB or t3.xlarge for 16 GB), confirm the change, then start it again.
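Scripted, the whole dance looks like this (instance ID is a placeholder):

```shell
# Stop, resize, restart -- the only way to "add RAM" on EC2.
aws ec2 stop-instances --instance-ids i-0abcdef1234567890
aws ec2 wait instance-stopped --instance-ids i-0abcdef1234567890
aws ec2 modify-instance-attribute --instance-id i-0abcdef1234567890 \
  --instance-type '{"Value": "t3.xlarge"}'   # 16 GB RAM
aws ec2 start-instances --instance-ids i-0abcdef1234567890
```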
Oh, and don’t forget: you’ll lose your auto-assigned public IP on the stop/start cycle unless you’ve pinned it with an Elastic IP. And good luck finding that option if you’re new to the console.
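The “pin” in question is an Elastic IP. Hypothetical IDs again, but the two calls are the whole trick:

```shell
# Allocate a static public IP and attach it, so stop/start can't take it away.
aws ec2 allocate-address --domain vpc        # returns an AllocationId
aws ec2 associate-address \
  --instance-id i-0abcdef1234567890 \
  --allocation-id eipalloc-0abcdef1234567890
```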
It’s the kind of complexity that makes you feel like you’re doing something wrong, even when you’re not.
Step 4: Finally, It Works (But at What Cost?)
Once I had enough disk space and memory, Ollama finally ran the DeepSeek model without complaining. I was able to use it offline, inside a fully isolated environment, without any cloud dependencies. It handled coding queries, Linux help, text generation — and it felt incredibly satisfying.
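For anyone repeating this at home, the happy path once the machine is sized right is short. The `8b` tag is an assumption on my part — pick a variant that fits your RAM and disk:

```shell
# Install Ollama, pull a DeepSeek-R1 variant, and run it fully locally.
curl -fsSL https://ollama.com/install.sh | sh
ollama pull deepseek-r1:8b    # ~5 GB download; larger tags need more disk and RAM
ollama run deepseek-r1:8b "Write a bash one-liner to find the 10 largest files."
```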
But it took so much unnecessary effort to get there.
Conclusion: AWS Is Powerful, But Hostile
AWS gives you the building blocks, but absolutely none of the scaffolding. If you just want to “try something,” be ready to work through ten different consoles, install packages manually, resize volumes with CLI tools, and deal with cryptic error messages.
Would I recommend running local LLMs in AWS? Sure — if you have a reason. But if you want developer-friendly, intuitive cloud infrastructure? There are easier, saner options.
I got DeepSeek working. But not before AWS tried to convince me to quit a dozen times.