DevConf.CZ 2020 has ended
Debug / Tracing
Friday, January 24
 

5:00pm CET

Using Pbench to debug Performance Problems
The Performance Engineering team at Red Hat has been developing a tool and infrastructure called Pbench, which helps collect, in a complete and consistent manner, data about the execution of a benchmark, including the configuration data of the systems involved. We'll show you the power of having the configuration data alongside arbitrary tool/metric data, which makes it easier to understand complex performance issues in distributed systems.

Speakers

Peter Portante

Software Engineer, Perf Eng
Performance engineer working at Red Hat since 2011, focusing on software tools for analyzing performance problems with distributed systems.



Friday January 24, 2020 5:00pm - 5:55pm CET
D105 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia
 
Saturday, January 25
 

1:00pm CET

The history of a Linux kernel design flaw
This talk about fixing long-standing bugs in the Linux kernel is based on the real story of a design flaw in the Linux kernel on the x86-64 architecture. The flaw had existed since the beginning of Linux x86-64 support in 2001 and was finally fixed in 2019.

18 years ago, when the first Linux kernel with x86-64 architecture support was released, it was capable of running processes that execute native x86-64 CPU instructions, and processes that execute legacy x86 CPU instructions. This feature was very popular in the early years of x86-64 architecture when the amount of already existing legacy 32-bit binary code exceeded the amount of native 64-bit binary code.
The feature was implemented in a way that allowed native 64-bit processes to invoke both native 64-bit system calls and compat 32-bit system calls - a fact that was not widely known and caused quite a few surprises for many years to come.
At the same time, the Linux kernel provided no API that would allow user processes to determine reliably whether the system call being invoked was a native 64-bit system call or a compat 32-bit one. For this reason, debuggers and system call tracers traditionally decided on the bitness of syscalls by looking at the value of the CS register, which describes the bitness of the process.
While this approach works in most cases, it fails miserably when the assumption that system call bitness matches the process bitness is not valid.
This problem was reported many years ago. The first report I'm aware of is bug #459820, reported to the Debian bug tracker in 2008. That bug report contains an example program which invokes a 32-bit "fork" system call, deceiving strace into thinking it's a 64-bit "open" system call with weird arguments and return code.
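The mismatch described above can be illustrated with a small sketch. The two tables are tiny excerpts of the i386 and x86-64 syscall tables, and `decode` and `guess_from_process_bitness` are hypothetical helpers written for this illustration; they are not strace code.

```python
# Tiny excerpts of the two x86 syscall tables (illustration only).
SYSCALLS_X86_64 = {0: "read", 1: "write", 2: "open", 3: "close"}
SYSCALLS_I386 = {1: "exit", 2: "fork", 3: "read", 4: "write", 5: "open"}


def decode(nr, syscall_is_64bit):
    """Map a syscall number to a name using the table for the given bitness."""
    table = SYSCALLS_X86_64 if syscall_is_64bit else SYSCALLS_I386
    return table.get(nr, f"unknown({nr})")


def guess_from_process_bitness(nr, process_is_64bit):
    """Decode the way a tracer does when it assumes that the syscall
    bitness always matches the process bitness."""
    return decode(nr, process_is_64bit)


# A 64-bit process invokes the compat 32-bit syscall number 2.
nr = 2
actual = decode(nr, syscall_is_64bit=False)
guessed = guess_from_process_bitness(nr, process_is_64bit=True)
print(f"actual: {actual}, tracer's guess: {guessed}")
# prints "actual: fork, tracer's guess: open"
```

The number alone is ambiguous: only knowing which entry path was used tells the tracer which table applies.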
The alternative method of obtaining system call bitness, also known since the early years of the x86-64 architecture, is to fetch from the process memory and analyze the opcode of the CPU instruction that caused the system call invocation. This approach is not reliable either, because reading process memory after the system call invocation is inherently racy; in 2012 Linus Torvalds even produced a short code example that reliably deceives the tracer.
The fact that the Linux kernel provided no API to obtain this crucial piece of system call information in a reliable way was recognized by Linux kernel developers as a problem many years ago. For example, at the beginning of 2012 there was a lively discussion on the matter. Many kernel hackers took part and many interesting ideas were discussed but, unfortunately, there was no follow-up because none of these people were interested enough to implement a solution.
It took many years to find people who really cared and were capable of delivering a fix. In November 2018, Elvira Khabirova proposed the first RFC patch to fix the problem by extending the ptrace API. Shortly after the discussion that followed, there was a second edition consisting of 16 patches, 15 of which extended and fixed internal Linux kernel APIs on various architectures. As a result of subsequent iterations the patchset grew further: it affected all architectures supported by the kernel and extended the audit and ptrace subsystems. To get it accepted into the kernel, we had no other choice but to split it into parts and upstream them via the appropriate maintainer trees. It was an amusing process full of pings.
To cut a long story short, it took almost 9 months to get all 29 patches implementing the PTRACE_GET_SYSCALL_INFO API merged into the kernel. The last patch of the series was accepted in July 2019, and the first Linux kernel release with this feature is 5.3.
PTRACE_GET_SYSCALL_INFO API is supported in strace starting with version 4.26 released in December 2018. strace performs a runtime check for PTRACE_GET_SYSCALL_INFO support in the kernel and automatically switches to use this API when it's available. Other userspace debuggers and tracers will follow.
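As a rough sketch of what the new API gives a tracer, the snippet below decodes the syscall-entry form of `struct ptrace_syscall_info` from a byte buffer. It assumes the little-endian layout from the kernel's uapi header (`op`, three bytes of padding, `arch`, `instruction_pointer`, `stack_pointer`, then `nr` and six arguments). The buffer here is synthetic; a real tracer would obtain it by calling ptrace with the PTRACE_GET_SYSCALL_INFO request on a stopped tracee. `parse_entry` and `BITNESS` are hypothetical names used only for this example.

```python
import struct

# Constants from the Linux uapi headers.
PTRACE_SYSCALL_INFO_ENTRY = 1
AUDIT_ARCH_I386 = 0x40000003
AUDIT_ARCH_X86_64 = 0xC000003E

# op, pad[3], arch, instruction_pointer, stack_pointer
HEADER = struct.Struct("<B3xIQQ")
# entry.nr followed by entry.args[6]
ENTRY = struct.Struct("<7Q")

BITNESS = {
    AUDIT_ARCH_I386: "32-bit (i386)",
    AUDIT_ARCH_X86_64: "64-bit (x86-64)",
}


def parse_entry(buf):
    """Decode a syscall-entry record; raise if the stop is of another kind."""
    op, arch, ip, sp = HEADER.unpack_from(buf, 0)
    if op != PTRACE_SYSCALL_INFO_ENTRY:
        raise ValueError(f"not a syscall-entry stop (op={op})")
    nr, *args = ENTRY.unpack_from(buf, HEADER.size)
    return {"arch": arch, "ip": ip, "sp": sp, "nr": nr, "args": args}


# Synthetic record: a compat 32-bit syscall number 2 with made-up addresses.
sample = HEADER.pack(PTRACE_SYSCALL_INFO_ENTRY, AUDIT_ARCH_I386,
                     0x1000, 0x7FFD0000)
sample += ENTRY.pack(2, 0, 0, 0, 0, 0, 0)

info = parse_entry(sample)
print(BITNESS[info["arch"]], "syscall number", info["nr"])
```

Because the kernel reports `arch` together with the syscall number, the tracer no longer has to infer the bitness from the CS register or from the instruction opcode.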

Speakers

Dmitry Levin

Chief Software Architect, BaseALT
Dmitry is the co-founder and the chief architect of BaseALT, a long time contributor to free software projects, including strace, Linux kernel, the GNU libc, Linux-PAM, and many others. Being the maintainer of strace since 2009, Dmitry gives talks about this tool for various audi...



Saturday January 25, 2020 1:00pm - 1:55pm CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia

2:00pm CET

strace: fight for performance
The talk gives an overview of various optimisations implemented in strace over the past several years. While most of them are quite trivial (like caching of frequently-used data or avoiding syscalls whenever possible), some of them are a bit more tricky (like usage of seccomp BPF programs for avoiding excessive ptrace stops) and/or target more specific use cases (like the infamous thread queueing patch[1], which had been carried as a RHEL downstream patch for almost 10 years).

[1] https://gitlab.com/strace/strace/commit/e0f0071b36215de8a592bf41ec007a794b550d45

Speakers

Eugene Syromiatnikov

Senior Software Engineer, Red Hat
A strace developer. Used to work in an HPC-related field. Currently employed at Red Hat as a software engineer in the kernel maintainers team, responsible for producing Driver Updates, and maintenance of various RHEL packages, including strace and Intel CPU microcode updates.

Dmitry Levin

Chief Software Architect, BaseALT
Dmitry is the co-founder and the chief architect of BaseALT, a long time contributor to free software projects, including strace, Linux kernel, the GNU libc, Linux-PAM, and many others. Being the maintainer of strace since 2009, Dmitry gives talks about this tool for various audi...



Saturday January 25, 2020 2:00pm - 2:25pm CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia

2:30pm CET

libperf: library for perf events monitoring
An introduction to the new library, which comes directly from the kernel's perf tool sources and provides an interface to count and sample perf events. I'll introduce the current interface and show it on examples. I'll also describe the planned functionality that will be ported from the perf tool in the future.

Speakers

Jiri Olsa

Software Engineer, Red Hat
Jiri works for Red Hat full time on Linux as a kernel generalist engineer in the Brno office, Czech Republic. He currently divides his work time between upstream perf work and maintaining RHEL perf.



Saturday January 25, 2020 2:30pm - 2:55pm CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia

3:00pm CET

An introduction to bpftrace tracing language
Quite often, one encounters an issue where a tracing tool comes in handy. It could be, for instance, a performance issue or a bug in an application that you're using. Linux already offers a wide choice of tracing tools, and chances are you've already used one. One of the latest additions in that area is bpftrace.

Bpftrace is a high-level dynamic tracing language. It allows easy and safe, yet powerful tracing of existing programs without the need to modify them. In this talk, I'll try to show you what can be done with bpftrace and explain why, when and how to use it.

Speakers

Jerome Marchand

Kernel engineer at Red Hat



Saturday January 25, 2020 3:00pm - 3:25pm CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia

3:30pm CET

Using bpftrace with Performance Co-Pilot & Grafana
In this talk the audience will learn how to use bpftrace, Performance Co-Pilot (PCP) and Grafana to get live, on-demand system metrics in the browser.
Attendees will learn how to set up the required components and how to write bpftrace scripts to gather system internals.

Previously, bpftrace scripts had to be run by SSHing into a server, executing them and interpreting the console output. With PCP and Grafana, we can have visualizations of bpftrace scripts. For example, we can visualize the return value of the vfs_read kernel function (number of bytes read) in a live heatmap, capture stack traces and display them as flame graphs, and trace network functions and display them in a table.

Speakers

Andreas Gerstmayr

Software Engineer, Red Hat
Andreas works as a Software Engineer at Red Hat. He's working on Performance Co-Pilot (PCP) and related projects like a Grafana plugin for PCP, eBPF/BCC and bpftrace exporters for PCP etc.



Saturday January 25, 2020 3:30pm - 3:55pm CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia

4:00pm CET

bpftrace internals
Ever wondered what actually happens when you use '@ = count()' in bpftrace? What's happening when you use hist or any other internal call? How many BPF maps are used, and how much do you slow down the system? I'll describe the bpftrace constructs and how they are translated to eBPF instructions. I'll also introduce some of the new features like BTF support.
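To give a feel for the kind of mechanism involved, here is a plain Python simulation of a per-CPU counter. It assumes that an aggregation such as count() keeps one slot per CPU, so that probes running on different CPUs never contend for the same counter, and that the slots are summed only when the value is read from user space. `PerCpuCounterMap` and the CPU count are hypothetical; this is not eBPF code and not bpftrace's actual implementation.

```python
from collections import defaultdict

NUM_CPUS = 4  # hypothetical machine size


class PerCpuCounterMap:
    """Simulated map that stores one counter slot per CPU for every key."""

    def __init__(self, num_cpus):
        self._num_cpus = num_cpus
        self._slots = defaultdict(lambda: [0] * num_cpus)

    def increment(self, key, cpu):
        # A probe firing on `cpu` touches only that CPU's slot.
        self._slots[key][cpu] += 1

    def read(self, key):
        # The reader adds up all per-CPU slots to get the total.
        return sum(self._slots.get(key, [0] * self._num_cpus))


counts = PerCpuCounterMap(NUM_CPUS)
# Simulated probe hits: (key, CPU on which the probe fired).
for key, cpu in [("@", 0), ("@", 1), ("@", 1), ("@", 3)]:
    counts.increment(key, cpu)

print(counts.read("@"))  # prints 4
```

The trade-off in such a design is memory for speed: every key costs one slot per CPU, but the hot path needs no cross-CPU synchronization.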

Speakers

Jiri Olsa

Software Engineer, Red Hat
Jiri works for Red Hat full time on Linux as a kernel generalist engineer in the Brno office, Czech Republic. He currently divides his work time between upstream perf work and maintaining RHEL perf.



Saturday January 25, 2020 4:00pm - 4:55pm CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia

5:00pm CET

Traceloop: Tracing containers syscalls using BPF
I will present traceloop, a tracing tool to trace system calls in cgroups or in containers using BPF and overwritable ring buffers.

Many people use the “strace” tool to synchronously trace system calls using ptrace. Traceloop similarly traces system calls but asynchronously in the background, using BPF and tracing per cgroup. I’ll show how it can be integrated with systemd and with Kubernetes via Inspektor Gadget.

Traceloop's traces are recorded in a fast, in-memory, overwritable ring buffer, like a flight recorder. As opposed to “strace”, the tracing could be permanently enabled on systemd services or Kubernetes pods and inspected in case of a crash. This is like an always-on “strace in the past”.
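The flight-recorder idea can be sketched in a few lines of Python. This is only a user-space simulation built on `collections.deque`; traceloop itself relies on BPF ring buffers, and the `FlightRecorder` class and the event names are hypothetical.

```python
from collections import deque


class FlightRecorder:
    """Fixed-size buffer that silently overwrites the oldest events."""

    def __init__(self, capacity):
        # A deque with maxlen drops items from the left once it is full.
        self._events = deque(maxlen=capacity)

    def record(self, event):
        self._events.append(event)

    def dump(self):
        # Return the most recent events, oldest first.
        return list(self._events)


recorder = FlightRecorder(capacity=3)
for syscall in ["openat", "read", "write", "close", "exit_group"]:
    recorder.record(syscall)

print(recorder.dump())  # prints ['write', 'close', 'exit_group']
```

Recording never blocks and never grows memory use, which is what makes it reasonable to leave such a recorder enabled permanently and read it only after something goes wrong.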

Traceloop uses BPF through the gobpf library. Several new features have been added in gobpf for the needs of traceloop: support for overwritable ring buffers and swapping buffers when the userspace utility dumps the buffer.

https://github.com/kinvolk/traceloop

Speakers

Alban Crequy

Co-founder and Director of Kinvolk Labs, Kinvolk
Alban is Co-founder of Kinvolk and director of engineering for Kinvolk Labs. He has a particular interest in integrating BPF into Kubernetes. He’s a maintainer of the gobpf library and has worked on software in the cloud space using BPF with Golang: Weave Scope, Traceleft, Project...



Saturday January 25, 2020 5:00pm - 5:25pm CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia
 
Sunday, January 26
 

10:30am CET

Deterministic debugging with Delve
In this talk I will dig into how Delve can be utilized to perform deterministic debugging for Go. This style of debugging enables users to record the execution of their process and "play it back" in a deterministic fashion in order to more quickly and efficiently perform root cause analysis on a bug that may otherwise be difficult to reproduce or track down. First, I will introduce the concept of deterministic debugging and why it is so useful and powerful. This will include a high-level overview of what exactly "deterministic debugging" means and why it's an important tool for any developer's toolbox. Along the way I will dig into some of the technical implementation details of deterministic debugging for those attendees who love to know how things work under the hood. Following that, I will use a live demo to showcase how this style of debugging can be used on a real Go program.

Speakers

Derek Parker

Senior Software Engineer, Red Hat



Sunday January 26, 2020 10:30am - 11:25am CET
E112 Faculty of Information Technology Brno University of Technology, Božetěchova, Brno-Královo Pole, Czechia
 