Over the past years we have rebuilt the sales platform of Deutsche Bahn with over 100 microservices, 400 people in over 30 teams and thereby (re-)created one of the biggest e-commerce platforms in Germany.
During this process we felt increasingly like Marty McFly, because in order to get early user feedback we began by integrating the “Elektronische Platzbuchungsanlage” (EPA), a software system that went into production in 1983(!). With system integration being a core activity in enterprise software engineering and architecture, and legacy systems in particular being a common occurrence in the everyday work of many software engineers, we would like to share some in-the-trenches experiences of the problems we faced. Let’s embark on a journey through the past, beginning with poorly accessible documentation and experts threatened by extinction, moving on to generational leaps in technology (for example regarding interfaces) and unclear non-functional requirements, and ending with mismatching concepts of staging and testing and the pitfalls of distributed systems and transactions. We will reflect on the difficult and painful solutions and discoveries we made, and on smarter ways of doing a legacy integration like this.
Ultimately, we are all building legacy systems of tomorrow, so let us share our thoughts about “future-proof” software, documentation and organizations. After all: “The only real mistake is the one from which we learn nothing.”
Let’s get started. Last session of the day, so thanks, everyone, for hanging in there until the very end, although I suspect the free beer they offered might have something to do with that.

This session is going to be about two questions, really. Number one: how do you best go about integrating legacy software? That’s something that’s been coming up for me every now and again over the last few years, and obviously the one referenced in the title is what really gave the inspiration for this one. The second one is, based on that and looking at past legacy integrations: what can we learn about how to come up with future-proof software architecture, future-proof software systems?

Before we get into it, I have to do the traditional “the speaker introduces themselves” slide. I’d like to keep this one short, though, because admittedly it’s a bit awkward this time. I work for Deutsche Bahn, and I suspect a few of you are impacted by the strike as well. I am too; we’re all in this together here. I came in by train last night and intended to leave tomorrow morning. I won’t, but the way I see it, there are worse places to end up stranded than Zurich, right? So let’s try and make the best of it.

Okay, so let’s get to it. I’m
usually not a big fan of these kinds of history lesson slides, but since we’re talking about a time frame of 40 years, I think I’m willing to make an exception.

This all started in, well, basically 1980, when the executive board of Deutsche Bahn decided that they wanted to take our booking system into, I guess you could say, the digital age. They devised a plan to have the entire booking process be a digital process by 1990. One of the first systems they built to realize that plan is the one that gave the talk its title. It’s called EPA, short for “Elektronische Platzbuchungsanlage”. It’s such a beautifully German word, it brings a tear to my eye every time; I’m sure the native speakers can appreciate it. It went into production in 1983.

If we skip ahead to the very end of the timeline, you can see that, I believe, the current plan is to finally lay it to rest in 2026, or maybe 2025, I’m not quite sure. By then it’ll be 43 years old, actually.

To keep things as short as possible: in 2001, so that’s when the original sales system was about 10 years old, the new executive board deemed it aged and carrying grave technical risks, and that includes EPA. There have been two entire renewals of the sales system after that; the most recent one I’ve been involved with. And by the way, 1991, when the original one was finished, is the year I was born. In 2023, I think in fall last year, we actually took the most recent one into production, and
so when we started out in 2017, 2018, our idea for the new sales system was that we wanted to go into production as early as possible to get early feedback, the way people suggest you should. Does my hair look good, Demetrius?

Now, the idea was that we wanted to go into production as early as possible. The name EPA basically translates to “seat reservation system”, and ultimately our goal was to replace it with a new seat reservation system. But we couldn’t do that right away, because we quickly realized it would be a difficult one to replace. So we started out by integrating the main features we needed, because to sell tickets we would need to be able to sell seats to our customers as well.

Interestingly, what is going to be the last nail in the coffin for EPA is not its age. Being 40 years old is remarkable in and of itself, but it would probably be able to keep on running after that. It’s not that it’s failing, it’s not that it doesn’t work anymore. Obviously it’s hard to maintain, but what’s actually going to kill it is that the runtime is no longer being supported. It was written in COBOL and it runs on an HP NonStop host system, and the point has come where even HP, and you can imagine the desperation when the sales department of HP says “no matter how much money you bring, we can’t seriously keep supporting this any longer”, basically set a deadline. So finally there’s no way around it anymore: we’re going to have to replace it.

But first we need to integrate some of the core functionality, and that’s what we’re going to look at with regard to the first question: how do you go about integrating legacy software? The first step, and this is going
to be painful for my fellow nerds here, but in my opinion what you’ve really got to do is talk to people. You need to talk to the people who built it, who maintain it, and who run it. In this case, obviously there’s an operations team, but at this point, after 40 years, the people who were originally involved were mostly retired, and some of them even deceased. I’m not exaggerating. So this was actually a bit tricky.

The next thing you do is, obviously, look at the documentation. At this point I was prepared for them to tell me to go down to the library and look at the written paper, but we were delighted to find that at some point they had decided to move the documentation into a digital format as well, and they went for plain HTML and CSS. We were like, great, it’s amazing: documentation that’s probably 20 years old, and you can still access it today. Nowadays most people use, for example, Confluence. Is Confluence still going to be around in 20 or 40 years? I have no idea, but I have my doubts.

Anyway, our joy was short-lived, because unfortunately they had decided to make the documentation fancy, and to get a fancy menu and a fancy layout they used ActiveX. If you use any modern-day browser, it just flat out refuses to open it and says it’s a huge security risk, you can’t do it. So there’s actually documentation on how to make the documentation accessible, but still, it’s a huge pain. And
then the last resort, and this one has come up with all the legacy systems I’ve worked on: you have to find the one guy who’s sort of the mastermind behind everything. I’m actually quite proud of this slide, because if you knew the actual guy, it’s a very accurate picture, even down to hairstyle, beard, clothing, everything. That’s basically him. He’s the kind of guy who joined Deutsche Bahn some 40 years ago, he was there when they initially started working on EPA, and he’s been with the same company for 40 years. He’s the guy you need to ask to explain everything, because pretty much everybody else treated EPA as sort of a black box. It mostly did what you would expect it to do, but he could oftentimes explain why and how it does things.

So we started by essentially spending a few days just picking his brain. The first thing he said to us, I can remember it quite clearly, was something along the lines of “So you’re the new guys.” We were like, “What do you mean?”, and he said, “Well, there have been at least two serious attempts at replacing the system, and they both failed phenomenally.” We were like, how hard can it be? And we were about to find out.

Obviously, when you think about how do we
go about integrating it, you need to look at the core features that this system offers. So we asked him: explain to us what EPA can do, and what part of that is still relevant to a modern-day system where you’re selling tickets for trains.

He said, well, the first one is not going to be very surprising: you can do seat reservations on trains. Okay, that’s truly not surprising. That involves making reservations, blocking seats, and canceling reservations to free the seats up again. And there are some special things. For example, we have something called graphical seat display. For those of you who’ve booked airline tickets, it’s like that: you get a graphical representation of what the train looks like, and you can decide exactly which seat you want to sit in, and so on and so forth.

Then he said: also, there are some journeys where our customers travel not only by train but also by bus or even by ferry, and with EPA you can make reservations on those as well. That’s a bit of a stretch, but while they probably don’t have coaches the way trains do, they still have numbered seats, so you can make reservations on them. Fun fact, actually: there’s a route from Emden, a town in northern Germany, to the isle of Borkum, where you go by ferry, and it’s mandatory to have a seat reservation. So we need to have that as well.

Then he went on and said: and you can also make reservations for car parks. That’s where things start to get a little strange, I guess. And then finally, and this one’s going to be fun, he said:
“Oh, it can also do everything related to pricing, vouchers, and billing.” Yeah, and that’s interesting. Some of you made pretty much the same expression I made when I first heard that. The more experienced people are going to go, “Oh no, that’s actually a bad thing.” Why is that? Because anyone who’s ever worked on anything remotely e-commerce knows that everything related to pricing, billing, and such should really be part of a separate domain, a separate system. I guess the only real explanation for this is that over 40 years people have kept on piling on features. It’s the mother, it’s the grandmother of feature creep, right?

That’s obviously going to be trouble, because if you use the EPA interfaces and tell it “make a reservation for me”, it’s going to say “okay, please pay me”, when in reality our new sales system obviously has its own separate domain for handling payment and such. So yeah, that’s going to be a pain.

Now, when it comes to integrating software systems, from an architecture point of view, especially, let’s say, enterprise architecture, it’s
quite simple, right? We have two systems. We have our new sales platform; it’s called movas, short for “mobility and vending services platform”, I believe. And then we have the reservation system, EPA. For enterprise architecture, if you want to integrate both, it’s quite easy: you draw a line and you’re done.

Now, we all know that’s a bit of an oversimplification. One of the very best decisions we made in the entire project came from keeping in mind that we’re doing this as sort of a temporary thing. Remember, our ultimate goal was to replace EPA. So it’s a good idea to use a very simple but very powerful pattern called adapter: you put everything you need for the EPA specifics into a separate component, the adapter. This way, when you finally want to replace EPA with a new seat reservation system, you just make movas point to a new adapter and, in turn, the new reservation system. From the perspective of our new sales system, all the APIs, all the interfaces, everything is going to be stable; you just have to switch the connection to the new adapter and you’ll be fine.

I know it’s a very simple pattern, but I can tell you this was one of the best decisions we ever made. It’s just that with something so big you need to be very, very careful to contain all the EPA-specific stuff in your adapter. Over time, the longer you go on and the more features you integrate, it’s going to be very hard, because this stuff has a tendency to creep beyond the adapter into the system you’re trying to protect. So you have to be very much on the lookout and try very hard to avoid that, and over the years
it’s very tough to actually keep that up.

Now, aside from this obviously being a very simplified view, there’s something else you need to consider: in this case, two worlds collide in terms of technology. On the one hand we have our 40-year-old system, written in COBOL, that runs on an HP NonStop host system. On the other hand we have our shiny new sales platform, which comprises, I don’t know, 100-plus microservices in the cloud, deployed through a CI/CD pipeline. We can do all kinds of fancy stuff: we can scale up and scale down, it’s developed in a test-driven manner, and such. So obviously the technological capabilities are quite different, and you would be well advised to take a look at the non-functional requirements in particular. That’s what you should do next in a legacy integration like this.

The first one that comes to mind is peak load: how much load can you handle, how many requests per second, because
we’re web-based. So we sat down with Clemens and the guys and asked them: what does it look like, how much peak load can you handle? What they said was, “Well, we can’t really give you exact numbers, but we can tell you that if we are in fact overloaded, we don’t really handle that too well.” Okay, right.

The next one is: what about response times, what are we talking about here? They said, “Oh, that’s a great one. Usually we’re going to respond within 0.5 seconds.” I think that’s actually a respectable value, considering that, if you knew the details, there’s some heavy lifting involved in some cases. So I think 0.5 seconds is decent. Then they continued and said, “But situationally it can be up to 32 seconds.” At this point, what would you consider a good timeout for this? Should it be 1 second, 5, 10, 15? No user is really going to be continuously
willing to wait 32 seconds. So that’s really weird, but there’s a good explanation for it. What’s going on in the background is that you can make reservations through our systems and book tickets not only for German trains. For example, consider this: you make a booking from, say, Freiburg to Mannheim. Those are two German cities, and for the entire journey you’re only going through Germany; there are no borders you cross. So why would this go to a reservation system from a different country? Because there are trains that serve this journey that started here in Zurich. Now you would assume it’s probably going to the SBB system, but actually it’s going to ÖBB, the Austrian railway, because it’s a Nightjet train. You don’t know that when you make the booking: your start point and your end point are German cities, you don’t cross any borders, but still, for EPA to make the reservation, it needs to invoke the reservation system of the foreign country. That’s where things get tricky, because some of our neighboring countries are known to be slow, and that’s where the delay in response time comes from. We’re working on that, I’ll get to that, but at least that’s an explanation.

Then the third one, when you look at typical non-functional requirements: authentication and
security. I was prepared for the worst here, but then they said, “Okay, we can’t do OAuth 2.” Well, that’s not surprising, because that’s comparatively modern technology. “But we can do TLS/SSL.” Maybe, like me, you would be a bit surprised and say, obviously you can do SSL, this has been around for as long as the internet has. I looked it up, actually: what you would consider general availability of SSL has only been around since 2000, whereas EPA has been in production since 1983. So for them it had been a conscious effort to actually introduce SSL. So yeah, I guess we were happy to have that at least.

Now, about request load and overload. As they mentioned earlier, they don’t handle overload too well, and as a matter of fact they don’t really have any mechanisms in place to shield themselves from systems that are authorized to communicate with them. Basically, it was up to us to prevent overloading them. So the first thing we had to do, before we even considered an integration in production, was to put in place some sort of mechanism that holds no matter what our users do. Because if you’re in the cloud and you can autoscale, our platform could potentially output something like 5,000 requests per second, or even more, I don’t know. And so
we needed to put something in place to protect EPA from being overloaded. Nowadays you would probably have something like a service mesh to handle that, but I’m not aware of any service mesh that would support HP NonStop. If any of you are, please let me know. So we used nginx to do the load limitation.

The way it looks is like this: we have our adapter, which handles the translation from whatever it is our sales platform speaks to EPA, and in between we put nginx as a load limitation device. Whenever it detects that there are too many requests per second coming in, it’ll just deny those with HTTP 429 Too Many Requests, and that’s it.

Now the only challenge is: how do you choose your limits? They couldn’t tell us specifically, so we had no other option but to actually test it by doing load testing and stress testing. Unfortunately, because the runtime has become so rare, I guess, the only system they have in place for production-like testing is production. They only have this very one. There’s another instance, but it can’t be compared to production in any way; it’s way less powerful.
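
The load-limiting idea described above (pass requests through up to a fixed rate, reject the rest with HTTP 429) can be illustrated with a small sketch. This is not the nginx configuration used in the project, and nginx implements its own algorithm; it is a generic token bucket in Python under stated assumptions. The names `TokenBucket` and `handle`, and the rate and burst values in the usage note, are hypothetical.

```python
import time


class TokenBucket:
    """Generic token-bucket limiter (illustrative, not the project's nginx setup)."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens refilled per second
        self.capacity = burst         # maximum number of stored tokens
        self.tokens = float(burst)    # start with a full bucket
        self.clock = clock            # injectable clock, useful for testing
        self.last = clock()

    def allow(self) -> bool:
        """Refill according to elapsed time, then try to spend one token."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle(bucket: TokenBucket, forward):
    """Forward the call to the protected backend, or reject it with 429."""
    if bucket.allow():
        return 200, forward()
    return 429, "Too Many Requests"
```

With, say, `TokenBucket(rate_per_sec=50, burst=10)` in front of the backend call, anything beyond the configured rate is answered with 429 immediately and never reaches the protected system. The hard part the talk describes, finding a safe value for the rate, is not solved by the sketch.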
So the only thing we had left was to test in production, and to find the limits you sometimes need to exceed the limits. So sometime in the middle of the night, when people are unlikely to be making bookings and reservations, we actually killed it a few times to find some good limits. That’s still a problem to this day. Actually, just recently we had a situation where something unexpected came up and it was overloaded anyway, and then we had to take countermeasures.

The way it looks is like this: when you get to the point where it gets overloaded, it’s not like it’ll gradually keep failing bit by bit. Rather, it’s like a vertical bar: it goes from “I answer all your requests” to “I refuse to answer any at all”. That’s why we have to be kind of defensive with the limits we choose, because in our case it’s better to be safe and refuse a few too many requests than to let a few too many through and overstress it, since it takes a while for it to recover.

Now, another thing regarding peak load handling is what we always consider to be our worst-case scenario
when it comes to load: two things coinciding. This has actually happened in the past: there’s a strong storm that causes, I don’t know, trees to fall onto the tracks and such, and at the same time you have people on strike. What that results in is tens of thousands of people stranded at the train stations, all frantically typing on their phones because they want to know whether their train is going to run or not. This is known to cause, in certain situations, more than 2,000 requests per second.

Now, even for a modern system, 2,000 requests per second is quite a lot. I’m sure that if I approached any of you and asked how you would design a system that handles 2,000 requests per second, you would take a minute to breathe, and your likely response would be the same as ours: the only realistic way to handle it is to introduce some sort of caching mechanism. If you don’t have that, if you only have the load limitation device, and you have it set to, let’s say, 50 requests per second while you’re getting 2,000 requests per second, then well over 90% of your users are just going to be shut out with HTTP 429. So what we had to do was introduce an additional component that caches all those requests.

So I guess this works, but now, all of a sudden, well, there’s that famous quote: there are only two hard things in computer science, one is naming things and the second is how do you do
cache invalidation. So now we have to solve cache invalidation as well. Great.

An interesting side note on the way we did it with nginx; I don’t know whether you have experience with this tool or not. It’s something we learned the hard way. If you google “top 10 configuration mistakes to avoid in nginx”, it’s number three: by default it does not use persistent, or keep-alive, connections. As a fun fact, I don’t know if you knew that, I didn’t: the way the TLS protocol works, every time you establish a connection there’s a handshake phase where, I think, three packets are sent, and compared to handling the actual requests that go through the connection, this initial handshake takes a certain amount of computational power. Whenever we did the stress tests, the EPA guys always told us, “The handshakes are what’s killing us.” So you actually have to make nginx use persistent connections, and you can tell it how large the pool should be, how many requests to send per connection before it’s shut down, how long a connection may idle before it’s shut down, and so on and so forth.

Another fun fact: EPA doesn’t do multi-threading, so you have to be careful not to open more connections than they have processors, because otherwise the parallelizing you’re doing actually does more harm than good. So there are a lot of technical details.

Anyhow, one more problem we have to look at when it comes to non-functional requirements is
that we had to think about solving distributed transactions. The problem here is the write operations, that is, when we actually tell EPA to block certain seats, meaning make a reservation, or to free them up because we’re canceling things.

The scenario is this: our user puts a ticket with its seat reservations into a shopping cart, and we want to guarantee that those seats are reserved for them for a while. So we make the reservation in EPA, and the seats are blocked. Now, if this user decides to, I don’t know, go for a shower and leave the laptop for a while, then after some time we need to free those seats up so other people can book them. What can obviously happen is that we make the cancellation request to EPA and for some reason we don’t get a response. This can happen in distributed systems for plenty of reasons: maybe they did respond and we just didn’t get it, maybe they didn’t get the request, maybe our timeout was a bit too short.

The problem is, if we don’t get a response to the cancellation request, we don’t know what the actual state of the inventory is. We don’t know whether the seats were actually freed up or whether they’re still reserved. And if we don’t solve this, if we say, “oh, let’s just do eventual consistency”, we’ll end up with trains whose seats are blocked by reservations that have never been canceled. It then looks like the entire train is booked out, but if you physically went onto the train on the day, you would find that it’s empty, because it’s all cancellations from distributed transactions that didn’t actually get through.

Now, this is not a new problem in distributed systems. The way
you would solve it today is, for example, with the Saga pattern. But then again, this one requires some sort of middleware, for example a messaging or event system, and obviously EPA does not support that. Another option is the traditional way: you do an actual distributed transaction. There’s the two-phase commit protocol for that, XA transactions for the Java guys, for example. Of course EPA does not support that either.

So what we had to do was essentially build our own transaction scavenger that keeps note of all the write operations on EPA and whether they failed or not. Then you have to keep sending compensation requests until you finally get a response, even if that response just says, “What you’re doing is stupid, I’m already done”, in which case it’s fine; otherwise the scavenger needs to keep retrying. So we had to solve that as well. Anyone who has ever even considered doing their own transaction management knows how much of a pain that is, but still, it needs to be done.

The story so far is that originally we started out with our sales platform and EPA, and we drew the line, and now all of a sudden we have this. But I didn’t make any of this up, right? Everything, I hope at least, makes sense; it’s just the consequence of distributed systems, I guess. The hard part, or the sad part, depending on how you want to see it, is that we haven’t even looked at any of the functional requirements yet, have
we? Yeah. At some point, when you do reservations on trains, you need to actually take a look at the physical inventory we have, that is, what the trains actually look like. On our tickets we print for our customers that you’re sitting in coach, I don’t know, 21, seat 1, by the window. Maybe you need a reservation because you’re bringing a bike, for example, or maybe you’re seated in a family compartment, or anything like that. To solve this, there’s something called the Fahrzeuglexikon. This is publicly available. It’s sort of an encyclopedia of all the trains in the inventory of Deutsche Bahn, and in the introduction it says that it’s essentially a concise summary of all the trains and what they look like.

You can find all kinds of interesting stuff in there. I took a picture here; this is from one of the newest generation of trains we have, the ICE 4. It has all kinds of information: for example the top speed, how many coaches it has, and then the entire layout. You have, for example, the on-board restaurant, then the different kinds of coaches, the different kinds of compartments, which way the seats are oriented, and such. Unfortunately, these are things you need to think about when you do seat reservations.

And as always, unfortunately, details matter. Those of you who have a bit more experience probably noticed this: what I showed you was page 117 of 269, and that’s just German trains. If you want to do reservations on all trains that run, day in, day out, you need to somehow support all of them. No way around it, unfortunately. I guess that’s why the past attempts at creating a new reservation system have been so hard. So the lesson learned here is that details matter, and taking a broad-strokes approach to legacy replacement and legacy integration is a slippery slope, to say the least.

Now, I want to show at least one of the interfaces in EPA. Over the years, obviously, new ones came piling on, but
Since uh the core functionality is making seat reservations it’s one of the oldest ones and it’s the API or the interface to actually making a reservation on a train and what it takes as arguments is roughly uh you tell it what train I chose the example of IC 690 um I think
This one goes from Munich to Berlin and then it asks you okay on what interval on what segment do you want to make reservation I chose Frankfurt to Berlin and it need to know on what day and how many people and a bunch of other stuff but basically that’s it and um the
Question to you is what do you think this interface would look like in a system that’s been 40 years old and actually I think I have time here uh I made it a mission to give you a minute and actually see I have so much space here for
Activities like it’s it’s really big really I’m sorry Mr cameraman I’m sure he hates me right now I go back to the podium right and the way I’m uh let’s break the suspension here the way it looks is like this like this is how you make a reservation
Request to EPA given the parameters and it actually you know you can’t obviously see it from where you’re sitting um but I actually looked in your faces and uh reactions again have been quite similar to what we had you know it’s some people are like okay that’s funny some are like
That’s interesting and some are a little shocked because uh realization is that you know we’re going to have to build a mock for this yeah obviously isn’t it like whenever integrate a third party system you want to do testing you got to have a mock for
This so what are we what are we looking at it’s uh a bunch of numbers at least uh it looks like it but it’s actually uh heximal uh numbers and what you can do is uh you put this string into a converter that takes hexadecimal and
Converts to as key and that’s not really what it is so basically it’s it’s uh it’s encoded in hexadecimal way and parts of it are human readable if you do the conversion like I’ve tried to highlight uh the parts where you can see how it works like uh do my clicker thingy
Work ah there it is okay so you have four people here right and then you have the train number 690 and then you have a date July 14 the time 12:14 you’re going from Frankfurt to Berlin uh it’s a booking in EPA and the train is an IC
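To make the decoding step concrete, here is a minimal Python sketch of the "hexadecimal to ASCII" conversion described above. The sample payload is invented for illustration and is not the real EPA message layout, and the function name `hex_to_readable` is hypothetical. The sketch only shows how readable fields such as a passenger count or a train number can surface between binary bytes.

```python
def hex_to_readable(hex_string: str) -> str:
    """Decode a hex string; keep printable ASCII bytes, show all others as '.'."""
    raw = bytes.fromhex(hex_string)
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in raw)


# Invented sample, NOT the real EPA format: binary framing bytes
# mixed with readable fields (passenger count, train number, train type).
sample_bytes = b"\x01\x00" + b"04" + b"\x1f" + b"00690" + b"\x1f" + b"IC" + b"\x03"
sample_hex = sample_bytes.hex()

print(sample_hex)
print(hex_to_readable(sample_hex))
```

Replacing non-printable bytes with a dot keeps field positions aligned with the original byte offsets, which helps when comparing a request against whatever documentation is available.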
Right so that’s basically the way it works uh and this brings me to the topic of doing testing um so we need a mock for doing this and uh in the integration itself and in the mock both what we’re going to do is we’re going to have to do a translation
Of the binary stuff into something human readable and then you know do some logic on it and then uh re-encode it uh to pass it to EPA or to you know make up a response and the way we did that is we used WireMock for this the standalone version
That you can run as you know a Docker container and there’s actually a cool feature called custom Transformers and the way it works is um you tell it to match a certain number again I have to look here uh you make it match to certain requests and you tell it which
Request so it’s this URL path and then if it’s a POST request and if it contains a certain header and in this case we wanted to match all requests with the MIME type application/octet-stream did anyone know this MIME type oh okay great uh the official documentation in the RFC says octet
Stream is for arbitrary binary data and I I think that’s a perfect like that couldn’t couldn’t sum it up any better okay right now uh I’m going to get uh to the end here slowly um so originally the question was you know um how do you
Build how do you come up with future proof software and you know I use this obviously as uh as an example and um there’s still though a whole bunch of things uh I think we can learn from this like the first one for example future
Like why why is it so hard well a famous quote comes to mind which says prediction is very difficult especially about the future and that’s true because you can’t know what’s going to happen no matter how hard you try you know there’s always going to be things coming up that
You didn’t anticipate and that’s why this is obviously so hard so um the best you can do is to try and keep things on the one hand as not fancy as possible right that’s one of the biggest learnings we had because whenever they decided to do
Things in a fancy way you know either be it um API design be it you know the documentation stuff they did this always gave us a headache so really I mean everyone knows about the KISS principle but this really really shines through here um and if for some reason like on
Occasions there are good reasons for doing things that are fancy then at the very least you should make sure to document why you did it for example architecture decision records would be in my opinion a great idea for this because the way it is with uh what we
Had there was no one around anymore to explain to us why it worked the way it did right and so if you do in fact decide to do something fancy for example good reasons might be you know performance reasons if it’s really pretty critical um if there’s you know
Uh technological reasons there are some but then please make sure to you know uh give people an idea of you know what was your choice you made what were the options that you weighed and for what reasons did you decide to go the way you
Did and uh closely related to that is um we actually ran into the Trap um of not managing knowledge in the organization itself so the way it is um with EPA the few people who are still around and ran it they treated it as sort of a um
Black box I guess you know they couldn’t really explain the way things worked um but they accepted that it did mostly do what it was supposed to do um and actually what happened was that you know at some point we had to get down uh in the trenches and do the heavy lifting
Of translating all the binary stuff and so on and that period went on for like I don’t know maybe a year or so uh and the people who were involved with doing uh the in-depth things of doing the translation they moved on after a while I mean you’ve seen that we’ve been
Working on this for five maybe six years and if the people who you know uh did the uh the the difficult part of you know uh thinking through it to understanding the way this translation Works uh they move away that’s what happens right and if you don’t make sure
That the knowledge is spread uh in your team then it’s going to leave with your people and then you know we have our mock solution and we have obviously our implementation of doing the translation and stuff and then this becomes a black box in turn because no one is
Around anymore to understand how does the translation work because no one’s around anymore to understand what’s being translated and so you need to be very careful to do uh an active you know Knowledge Management in your teams and in your organization and actually it’s the other way around as well
Um once the implementation is in place uh people leave new people join your team and they start working on new features and every once in a while it’s going to be required that they uh take a look into the deep down stuff and they’re going to be like uh that’s very
Weird I want to try and avoid that right and you need to do a lot of pair programming actually you know guide them and introduce them to it because then otherwise everyone’s going to be uh I’m going to try to avoid that and then you’re running into the same trap that
They did in the past now um another thing is that you know uh I guess it it comes back to the question of like is there such a thing as future proof software that’s going to run forever and uh the answer is obvious really like if you have the if
Your idea of future proof is that you build a system once and you do it really well and then you just you know put it into a box and keep it running for 20 years and never touch it again obviously there’s no such
Thing right um and then what it really takes is you need to do you need to be aware that it requires managing the life cycle of your software I think there’s two ways you can approach this you know either you accept that you need to constantly invest into it you need to
Constantly uh solve all the technical and domain debt that will uh inevitably pile up over time you need to address it or you can go the other way and say okay I’m doing like um planned obsolescence I accept that this is going to be broken or unmaintainable in 5
Years and you know I have a plan that in 5 years I’m just going ahead and just replace it entirely that’s way you can go as well I guess uh in general uh my advice would be to be very careful of doing you know these broad
Strokes I obviously like it was a bit tongue in cheek when I gave the Enterprise architect thing um but still you know we have a tendency to oversimplify and this is going to bite you uh in the long run you know um and oh yeah third one um about the
Migration effort like I talked a lot about the integration when it comes to replacing Legacy systems unfortunately what I see very very often is people look at you know the thing the Legacy system they have and they take inventory of all the stuff it does like they make
A map of all the features it has and then what they do is uh they build a new one that replicates all the functions the old one had and then pile on new stuff and that’s the worst thing you can do right what you really should be doing
Is you should be cutting off any of the old ties you had previously because that’s exactly what got you into the mess the last time why would you do the same thing over and over again and then end up with the same mess again you need to learn from your mistakes right
And so you uh you have to be and it’s actually hard it sounds very simple in principle but when you you know talk to the people who who used to work on the old system every time you talk about new features the first thing they’re going
To be saying is you know the way we used to do this was this and this and this and we had processes that look like that instead what you need to do is think of new processes think of better ways of doing things it sounds very simple but I
Can tell you it’s actually in practice quite hard because people are very averse to change and now finally actually what I can say is that you know every time I’ve been working with any Legacy systems because it sounds boring and often times people will say oh you know Legacy uh I
Don’t care about it every time there’s been so much I learned from this like in this example you know starting with distributed transactions we had the whole TLS handshake thing the protocol we had you know timeout chains circuit breakers service mesh caching there’s so much fine-tuning we had to do like with
The load limitations there’s so many parameters you can play with where you know you do small tweaks and they have large impact to me it’s always been you know actually I learned so much so yeah I guess you should really see it more as an opportunity than a challenge I guess
Right now that’s me with two minutes left I think right yeah and I’ll just skip this one uh and get to the end uh thanks so much for sticking through um for those of you who are not you know thirsting for the free beers that have been offered to us I’d be
Happy to answer any questions and maybe you have your own views or experiences to share thank you [Applause] yeah uh thank you for the talk um maybe uh I missed it in the first place but why did you have to integrate the old system and couldn’t simply write a
Completely new system with the new features and without the old ties I think I I mentioned this so the idea was that we wanted to go into production as quickly as possible with our sales platform and so in order to sell tickets that are valid like on the train like
You know when they uh make checks and stuff you need to also be able to sell uh reservations along because there are some connections where reservation is mandatory and that’s the reason yeah um it’s not really a question first
Of all thank you for the talk I find it resonated with my own experience but I also like to work on Legacy systems uh just a quick story to share you were actually lucky because you had somebody and you had some specification my favorite my favorite experience was when the
Customer said we don’t know what it does but it has to be replicated exactly in the new system okay and then you have to do comparison test essentially you had the old system running some input data and the new system had to produce the same output but you had no idea actually
No specification crazy consider yourself lucky okay actually if you stay around for for a bit maybe let’s talk after I’d be interested to hear about that last question well uh maybe the main challenge is to create the adapter in a way that is not a replica of the pure
API of your legacy system yeah absolutely yeah for sure how did you tackle that uh we so we’re doing consumer-driven contracts so uh in a way uh you know the domains that do the booking process they dictate what our API should look like and the adapter needs to handle Translating that into
Binary stuff okay thank you okay I believe time is up that’s it again thank you so much um and to those of you who are leaving by train may your trains always be on time thank you