[OC] Reviving and Advancing Page Cache Attacks on Linux (My first publication as a PhD student!)

A Basil Plant@lemmy.world · 2 months ago

My website’s the one linked in this post: https://snee.la/

My email is at the contact page: https://snee.la/contact/

sneela [at] tugraz [dot] at

A Basil Plant@lemmy.world · 2 months ago

I’ll be sure to reach out if I find myself being unable to replicate it.

No worries, and good luck! My email can be found on my website if you want it :D

I wasn’t even talking about tikzplotlib. It’s just that the pgf backend is now supported by matplotlib and you can produce pgf files with it.

Ah… I think I’ve heard of it, but I never really registered that. Thanks for the info :D

A Basil Plant@lemmy.world · 2 months ago

I could give you the TikZ source of Fig 2 if you’d like. The patterns and colors of the plots took me almost a day to choose. I wanted to go for a color-blind friendly palette and still keep it looking snazzy. (https://github.com/simon-pfahler/colorblind)

I’m familiar with matplotlib -> PGFplots (using the Python tikzplotlib library). Unfortunately, I decided against using it for the paper as it produces quite unmanageable outputs. Especially when I reran experiments with new data and later wanted to change patterns, colors… it was always more of a hassle. I used it for my Master’s thesis.

Instead, Python program -> show plot -> if okay, generate CSV.

In LaTeX, have PGFplots code which reads the CSV file and generates the plot that way. Much, much easier to maintain.
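
A minimal sketch of the Python side of that workflow, assuming the experiment results are already available as two equally long lists. The function name `write_plot_csv` and the column names are hypothetical and only for illustration; this is not the script used for the paper.

```python
import csv
from pathlib import Path


def write_plot_csv(path, xs, ys, x_name="x", y_name="y"):
    """Write two equally long columns to a CSV file with a header row."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    # newline="" lets the csv module control line endings itself.
    with Path(path).open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([x_name, y_name])  # header row, used to select columns by name
        writer.writerows(zip(xs, ys))      # one data point per row
```

On the LaTeX side, a file written with the hypothetical column names `size` and `latency` could then be read with something like `\addplot table [col sep=comma, x=size, y=latency] {results.csv};`, so patterns and colors stay in the PGFplots code and only the CSV changes when experiments are rerun.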

A Basil Plant@lemmy.world · 2 months ago

Thanks for your words!

Yes! We use TikZ for the diagrams, which can be a nightmare sometimes… but it gets better the more I use it.

Regarding the plots, we use PGFplots. I often use matplotlib for quick plots while running experiments, but the paper itself uses PGFplots with the data in a CSV for that sweet, sweet scaling when you zoom in.

A Basil Plant@lemmy.world · edit-2 2 months ago

[OC] Reviving and Advancing Page Cache Attacks on Linux (My first publication as a PhD student!)

A Basil Plant@lemmy.world · 5 months ago

If the reports are somewhat technical (written with LaTeX for example), check out sioyek: https://sioyek.info/. It’s a PDF reader mainly for academic use.

Sioyek has made reading and reviewing papers SO much easier and it’s really, really convenient… once you get the hang of it. It takes a bit of time to get used to all the things, but it’s worth it. I also review students’ theses with it. Highlighting colors and adding comments is super easy (select text, h+g (green highlight), type comment).

If you want to export your notes and comments, you will need this script though: https://github.com/ahrm/sioyek/blob/main/scripts/embedded_annotations.py

A Basil Plant@lemmy.world · edit-2 1 year ago

~~I suggest using two different spellings:~~

~~Mold is the fungus. To mould is to shape.~~

Nvm I’m an idiot. Lol

A Basil Plant@lemmy.world · edit-2 1 year ago

That seems to be the consensus online. But thanks for that tidbit! It feels even more bizarre now knowing that.

I wonder why a handful of people think the way I presented in the post. Perhaps American/British influences in certain places? Reading books by British authors and books by American authors at the same time? Feels unlikely.

A Basil Plant@lemmy.world · edit-2 1 year ago

Do you use "mould" or "mold"?

A Basil Plant@lemmy.world · 2 years ago

Ah if you messed it up, you can press “e” on the grub entry and edit the command line parameters to remove the thing that messes it up. Good luck with your fresh install [and use Debian this time… jk :)]

A Basil Plant@lemmy.world · 2 years ago

Make sure to update your grub after you do. I’ve messed that one up before lol 😅

A Basil Plant@lemmy.world · 2 years ago

Do you not need the nvidia-drm.modeset=1 in GRUB_CMDLINE_LINUX?

https://www.if-not-true-then-false.com/2015/fedora-nvidia-guide/#262-edit-etcdefaultgrub

A Basil Plant@lemmy.world · 2 years ago

Could you show us the kernel command line parameters (in /etc/default/grub)? Is the modeset enabled along with the other params? I’m not a Fedora user, so I may not be of too much help.

A Basil Plant@lemmy.world · 2 years ago

https://duckduckgo.com/?q=mini+wheats+cereal+&t=fpas&iar=images&iax=images&ia=images

Mini Wheats?

A Basil Plant@lemmy.world · 2 years ago

Please post the source next time. I spent 2 minutes looking for it: https://chrisdallariva.substack.com/p/when-the-fck-did-we-start-singing

A Basil Plant@lemmy.world · edit-2 2 years ago

I’m glad you appreciate it! It’s always fun digging into kernel internals and learning new things :D

I’m also open to criticism about the writing if you have any.

A Basil Plant@lemmy.world · 2 years ago

How System Requests Work and How to Add Your Own SysReq

A Basil Plant@lemmy.world · 2 years ago

Thank you, I’ll send you an email within a day.

A Basil Plant@lemmy.world · 2 years ago

Would you consider sending it to Austria? I’d pay shipping charges (if it’s within reason lol). If so, you can send me an email at: sneela-hwelemmy92fd [at] port87.com

A Basil Plant@lemmy.world · 2 years ago

Are you planning to scrap the CPU? I may be interested in it as I find faulty hardware fun to experiment on.

A Basil Plant@lemmy.world · edit-2 2 years ago

You haven’t given us much information about the CPU. That is very important when dealing with Machine Check Errors (MCEs).

I’ve done a bit of work with MCEs and AMD CPUs, so I’ll help with understanding what may be going wrong and what you probably can do.

I’ve done a bit of searching based on the microcode & the Dell Wyse thin client that you mentioned. From what I can gather, are you using a Dell Wyse 5060 Thin Client with an AMD Steppe Eagle GX-424 [1]? This is my assumption for the rest of this comment.

Machine Check Errors (MCEs) are hard to decipher without the right documentation. As far as I can tell from AMD’s Data Sheet for the G-Series [2], this CPU belongs to family 16H.

You have two MCEs in your image:

CPU Core 0, Bank 4: f600000000070f0f
CPU Core 1, Bank 1: b400000001020103

Now, you can attempt to decipher these with a tool I used some time ago, MCE-Ryzen-Decoder [4]. Note that the name says Ryzen: this tool only decodes MCEs of Ryzen architectures. MCE designs may not change much between families, but I wouldn’t bank (pun not intended) on it, because the G-Series is an embedded SoC while the Ryzen CPUs are not. Still, I gave it a shot and the tool spit out the following:

$ python3 run.py 04 f600000000070f0f
Bank: Read-As-Zero (RAZ)
Error:  ( 0x7)

$ python3 run.py 01 b400000001020103
Bank: Instruction Fetch Unit (IF)
Error: IC Full Tag Parity Error (TagParity 0x2)

Wouldn’t bank (pun intended this time) on it though.

What you can do is to go through the AMD Family 16H’s BIOS and Kernel Developer Guide [3] (Section 2.16.1.5 Error Code). From Section 2.16.1.1 Machine Check Registers, it looks like Bank 01 corresponds to the IC (Instruction Cache) and Bank 04 corresponds to the NB (Northbridge). This means that the CPU found issues in the NB in core 0 and the IC in core 1. You can go even further and check what those exact codes decipher to, but I wouldn’t put in that much effort - there’s not much you can do with that info (maybe the NB, but… too much effort). There are some MSRs that you can read out that correspond to errors of these banks (from Table 86: Registers Commonly Used for Diagnosis), but like I said, there’s not much you can do with this info anyway.
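
As a rough illustration of what the two status values encode, here is a minimal Python sketch that splits an MCi_STATUS value into the generic x86 flag bits (63 down to 57) and the low 16-bit error code. Treating bits 20:16 as the extended error code is an assumption on my part; it happens to match the 0x7 and 0x2 that the tool printed, but the BKDG [3] is the authoritative source for the field layout. The function name `decode_mc_status` is made up for this example.

```python
# Flag bits in the upper part of an x86 MCi_STATUS value.
STATUS_FLAGS = {
    63: "VAL",    # register holds a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected
    60: "EN",     # reporting for this error was enabled
    59: "MISCV",  # MISC register holds additional info
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context may be corrupt
}


def decode_mc_status(status_hex):
    """Split a hex MCi_STATUS string into flag names and error code fields."""
    value = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (value >> bit) & 1]
    return {
        "flags": flags,
        "error_code": value & 0xFFFF,            # bits 15:0
        "error_code_ext": (value >> 16) & 0x1F,  # bits 20:16 (assumed layout)
    }


if __name__ == "__main__":
    for bank, status in [(4, "f600000000070f0f"), (1, "b400000001020103")]:
        print(bank, decode_mc_status(status))
```

Both values have UC set, and the bank 4 one additionally has OVER and PCC set, which fits the picture of a machine that cannot safely continue.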

Okay, now that the boring part is over (it was fun for me), what can you do? It looks like the CPU is a quad-core CPU. I take it to mean that it’s 4 cores * 2 SMT threads. If you have access to the Linux command line parameters [5], say via GRUB for example, I would try to isolate the two faulty cores we see here: core 0 and core 1. Add isolcpus=0,1 and see whether the kernel boots. We only see two CPU cores failing here, but others may also be faulty without their errors having been printed. It’s worth a shot, but it may not work.

Alternatively, you can tell the kernel to disable MCE checks entirely and continue executing; this can be done with the mce=off command line parameter [6]. Beware that this means that you’re now willingly running code on a CPU with two cores that have been shown to be faulty (so far). isolcpus, by contrast, makes sure that the kernel doesn’t execute any “user” code on those cores unless asked to (via taskset for example).

Apart from this, like others have pointed out, the red dots on the screen aren’t a great sign. Maybe you can individually replace defective parts, or maybe you have to buy a new machine entirely. What I’ve described in this comment is a way to check whether your CPU still works with 2 SMT threads faulty.
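
For a single process, something similar to what taskset does can be sketched in Python with the Linux-only `os.sched_setaffinity` call. This is only an illustration under the assumption that cores 0 and 1 are the faulty ones; it restricts one process and does not replace isolcpus, which acts at boot time. The helper name `usable_cpus` is made up for this example.

```python
import os


def usable_cpus(available, faulty):
    """Return the CPUs from `available` that are not listed in `faulty`."""
    return set(available) - set(faulty)


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    # Assumption: cores 0 and 1 are the faulty ones, as in the MCE output.
    allowed = usable_cpus(os.sched_getaffinity(0), {0, 1})
    if allowed:  # never try to set an empty CPU set
        os.sched_setaffinity(0, allowed)  # pid 0 means the calling process
    print(sorted(os.sched_getaffinity(0)))
```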

Good luck and I hope you fix your server 🤞.

[1] https://www.dell.com/support/manuals/en-us/wyse-5060-thin-client/5060_wie10_ug/system-specifications?guid=guid-cbeecec5-25ac-4103-8b4b-7d3a975e91f0&lang=en-us

[2] https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/datasheets/52259_KB_G-Series_Product_Data_Sheet.pdf

[3] https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/52740_16h_Models_30h-3Fh_BKDG.pdf

[4] https://github.com/DimitriFourny/MCE-Ryzen-Decoder

[5] https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html

[6] https://elixir.bootlin.com/linux/v6.9.2/source/Documentation/arch/x86/x86_64/boot-options.rst

A Basil Plant@lemmy.world · edit-2 2 years ago

The debug version you compile doesn’t affect the code; it just stores more information about symbols. The whole shtick about the debugger replacing instructions with INT3 still happens.

You can validate that the code isn’t affected yourself by running objdump on two binaries, one compiled with debug symbols and one without. Otherwise if you’re lazy (like me 😄):

https://stackoverflow.com/a/8676610

And for completeness: https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc/Debugging-Options.html

A Basil Plant@lemmy.world · edit-2 2 years ago

Excellent question!

Before replacing the instruction with INT 3, the debugger keeps a note of what instruction was at that point in the code. When the CPU encounters INT 3, it hands control to the debugger.

When the debugging operations are done, the debugger replaces the INT 3 with the original instruction and makes the instruction pointer go back one step, thereby ensuring that the original instruction is executed.
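
The save/patch/restore dance can be sketched as a toy simulation in Python. Nothing here touches a real process: the "code" is just a bytearray, and the class name `ToyDebugger` and its methods are made up for this example. Real debuggers typically also single-step and re-insert the breakpoint afterwards, which this sketch leaves out.

```python
INT3 = 0xCC  # one-byte x86 breakpoint opcode


class ToyDebugger:
    """Toy model of software breakpoints on a buffer of machine code."""

    def __init__(self, code):
        self.code = bytearray(code)
        self.saved = {}  # breakpoint address -> original byte

    def set_breakpoint(self, addr):
        # Remember the original byte, then overwrite it with INT3.
        self.saved[addr] = self.code[addr]
        self.code[addr] = INT3

    def on_trap(self, ip_after_trap):
        """Handle a breakpoint hit and return the address to resume at."""
        # INT3 is one byte long, so the instruction pointer is one past it.
        addr = ip_after_trap - 1
        # Put the original byte back so the real instruction can run.
        self.code[addr] = self.saved.pop(addr)
        return addr


if __name__ == "__main__":
    # push rbp; mov rbp, rsp; ret
    dbg = ToyDebugger(bytes([0x55, 0x48, 0x89, 0xE5, 0xC3]))
    dbg.set_breakpoint(1)
    print(dbg.code.hex())      # byte at offset 1 is now cc
    resume_at = dbg.on_trap(2)
    print(resume_at, dbg.code.hex())
```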

A Basil Plant@lemmy.world · edit-2 2 years ago

Slightly less than two drinks = positive effect on programming ability

A Basil Plant@lemmy.world · 2 years ago

A Basil Plant

[OC] Reviving and Advancing Page Cache Attacks on Linux (My first publication as a PhD student!)

Do you use "mould" or "mold"?

How System Requests Work and How to Add Your Own SysReq

Slightly less than two drinks = positive effect on programming ability

Firefox and XPI Files
