Meet Graham Lee, who has decided to investigate research software engineering for his PhD thesis. I met Graham at the RSE conference in Newcastle, UK, in late 2022, where he presented his work in the form of an “inverted” talk (i.e. driven by the audience).
Since not many people make RSEs the subject of a PhD thesis, I caught up with Graham after the conference. In our conversation we discuss his thesis and questions such as whether RSE is actually a distinct role or discipline, and what the future might hold.

Talking of conferences:

• UK RSE Conference https://rsecon23.society-rse.org Swansea, UK, 5-7 September 2023.
• NOTE: abstract submission deadline extended to 3 May 2023!
• Key dates: https://rsecon23.society-rse.org/key-dates/
• US RSE Conference (the first!) https://us-rse.org/usrse23/ Chicago, IL, USA, 16-18 October 2023
• deRSE Unconference https://un-derse23.sciencesconf.org/index Jena, Germany, 26-28 September 2023.

Thank you for listening and your ongoing support. It means the world to us!
Support the show on Patreon https://www.patreon.com/codeforthought

Get in touch:

• Email mailto:code4thought@proton.me

• UK RSE Slack (ukrse.slack.com): @code4thought or @piddie

• US RSE Slack (usrse.slack.com): @Peter Schmidt

• Mastodon: https://fosstodon.org/@code4thought or @code4thought@fosstodon.org

• LinkedIn: https://www.linkedin.com/in/pweschmidt/ (personal Profile)
• LinkedIn: https://www.linkedin.com/company/codeforthought/ (Code for Thought Profile)

This podcast is licensed under the Creative Commons Licence: https://creativecommons.org/licenses/by-sa/4.0/

Welcome to another episode of Code for Thought. In this episode I’ll be talking to Graham Lee from the University of Oxford in the UK about his PhD thesis on research software engineering. But before we go into the interview, there’s an important announcement I need to make. The Society for Research Software Engineering in the UK will be holding its annual conference in Swansea this year, between the 5th and the 7th of September, and maybe you’ve seen this on the website already or received an email. Now, if you plan to submit a paper for a presentation or a poster session, you’re in luck, because the conference committee has just extended the deadline for submissions to Wednesday, the 3rd of May 2023. So if you’ve been sweating it to get your abstract out, you’ll have an extra week. It’s probably unlikely there will be a further extension, so please don’t miss out. Again, the call for submissions for the UK RSE conference is extended to the 3rd of May. The website and details can be found in the episode notes.

All of which is actually a nice segue into my conversation with Graham, because Graham and I met at the previous RSE conference in Newcastle in 2022. Graham gave what’s known as an inverted presentation, basically letting the audience drive what the presentation will talk about. The subject was to find out what and who RSEs are, which is of course the main subject of Graham’s PhD thesis and, at least to my knowledge, the first PhD study to be conducted about RSEs rather than working with RSEs. So who are we then? Well, let’s turn to my conversation with Graham now.

Hello Graham, and welcome to the show, and thanks very much for your time. Before we start, can you introduce yourself

quickly?

My name is Graham Lee. I am doing a, well, we call it a DPhil in Oxford, because obviously Oxford has to be different, basically a PhD, on research software engineering and RSEs and their work. For three years I was a senior RSE in the Oxford RSE Group as well.

I think that’s quite interesting, because I don’t know how many people actually did a PhD on research software engineering specifically. Were you assigned a software engineering role, or was that about software engineering, the role that you had as an RSE in Oxford?

So I was recruited into Oxford to work as an RSE. The Oxford RSE group had been running for approximately a year with, I think, two people, and they were trying to grow this group and had some EPSRC funding to do work within the university, so I was hired on that basis. Before I’d worked at Oxford, I’d had a fairly long, about 15-year, career as an industry software engineer, and I had completed an MSc in software engineering, also from Oxford. You know, I had that sort of industry background of knowledge, which is what they were interested in bringing into the group, and the doctorate sort of arose out of that, because if the thing I’m being asked to do is take all of your industry knowledge of software engineering and apply it in this context, I sort of naturally had the question: well, is that the right thing to do? You know, is this the knowledge that’s needed? And as you say, this isn’t an area that had actually been systematically covered or researched, and so it looked like a good opportunity to give up on a social life for a few years, as one Darman one doesn’t appear.

Exactly. I mean, it’s quite interesting

because it’s almost like a meta thing about research software engineering, isn’t it? And the reason we talk about this is because I met you at the research software conference in Newcastle in 2022, and you gave a talk, well, not so much a talk as an interactive session, on who is a research software engineer. So is that part of your thesis, actually trying to find out who we are?

Yeah, exactly, because I think that there’s an interesting question in there. You know, if you look at the Society of Research Software Engineering, it sort of has two parts to its mission statement. One is to improve the construction, development, whatever you want to call it, of software in research or used for research, and the other is to carve out a career pathway, a structure, some sort of, you know, stable employment model for research software engineers. Now, in order to do that second part, we have to understand: well, what are the careers? You know, we know that it’s actually a fairly broad, and I think deliberately fairly broad, label that is kind of freely available for self-identification. If you are working in research and you do software, you are allowed to call yourself an RSE and align with this movement, you know, join the society. But then there are also people who are employed on contracts that say research software engineer, or that have a job title with research software engineer, not all of whom are doing the same job as well. One of the common distinctions that’s raised is between people who are in a sort of pool group, where there’s a research software engineering service in a university that researchers can come to work with or to get help or support, and then there’s what’s called the embedded research software engineer, who is much more akin to a postdoctoral research assistant whose focus in contributing to the research happens to be on the software-related tasks. So there are already those two distinctions. There’s already probably a continuum between those. Any particular individual may have bits of their job that look like one and bits of their job that look like another. So is there actually a thing that we can call either research software engineering or a type of person, a sort of category of person working in research, that we can call the research software engineer?

Then there is a third kind of role that I found when I talk to people, particularly in countries where the funding is a little bit thin on the ground when it comes to research software engineering as such, where people might identify as a research software engineer but they pretty much fly under the radar, so they’re actually officially researchers or they have a fellowship on some kind of research subject. But I think in the UK it’s now gradually getting a little bit different

now. Or what did your PhD find? Was that actually part of your remit, finding out what kind of roles we have in the UK?

The motivation for the work that led to this tool was actually trying to understand whether research software engineering exists as a profession that is distinct from some other role, for example a postdoctoral researcher. So then we ask the question: are there any boundaries, and how fluid are they? You know, is there a point where you can say, right, if you’re doing this work then you are an RSE, but if you step over by one task towards writing research, maybe you contribute one more paragraph to a paper or something, it’s not going to be that clear-cut, but, you know, it’s something that turns you into a researcher and means that you’re not a research software engineer. And I’m looking at that through four characteristics which the research literature on the social sciences of professions identifies as being the hallmarks of a distinct profession, which are: it has a distinct body of knowledge, i.e. there are things that you would expect an RSE to know that you wouldn’t expect a researcher to know, or you wouldn’t expect an industry software engineer to know; that there is technical autonomy, i.e. RSEs get to choose how their work is done and what they do in order to fulfill their aims as an RSE; that there is a distinct sense of status or privilege associated with the role; and the fourth of these properties, and this one’s really hard to summarize because it’s got a long-winded phrase, is a normative value of service to others or society. What that means is that the profession, in this case RSE, has a collection of values that are shared and an understanding of what the work does to benefit the people that they work with, and that that has some sort of independence of the work that you’re given by, you know, your supervisor, by your manager, by your client.

So if you take a more established profession, you know, law is an example of an ancient profession, you can see all four of these properties. You need to understand law before you’re allowed to be a lawyer, and there are exams to make sure that you do. There is autonomy: you go to a lawyer and say, I want a divorce, or I want conveyancing done on my house, and then they go away and do that. They don’t say, what documents would you like, what order would you like them in? They just do the work and then show you the results. They have a distinct status, you know, there is some kind of idea of privilege associated with being a lawyer. Now obviously there are also views on shady lawyers, but then that’s where this fourth part, the normative value of service, comes in, because the law profession has an ethical code. It’s actually quite strongly enforced, because there is a professional society and you have to be a member in order to practice, and you can be removed from that and therefore unable to practice, you know, if you are found to be acting unethically.

We kind of meet some of the criteria, don’t we? I had a discussion a while ago with somebody: is RSE actually a discipline? And we weren’t really quite sure whether it is or not, because I think in the beginning you mentioned how distinct is it. It kind of is distinct from an academic career, because it’s not just about the research, it is about research as well, but how different is it, for instance, from software engineering in industry? Is that really a difference?

Do I need to have special skills as a research software engineer, as opposed to somebody who works in an office, looks after a website or creates code for an application?

And the answer to that is, and this is one of the interesting parts of the research I’ve done to date, you will get different answers from different people, and so it’s still fairly inconclusive. So let me just summarize my methodology quickly; we can get back to it if it becomes relevant. Basically I interviewed a number of different people. I contacted them on the society Slack, then also emailed people who were in particular RSE groups, and so I got a range of responses from people who had literally landed in a role two or three days before they did the interview, people who were experienced RSEs, people who managed RSE groups, and then some people who work with RSEs, like someone who works at a vendor, on their interactions with research groups, and a journal editor for a data science journal that has a lot of research software content. So of those people, the ones who had experience in industry tended to say that, you know, the nuts and bolts of it, the software engineering, is basically the same. You’re writing Python, and it doesn’t matter whether you’re writing Python for a company or for a university. You’re using version control tools; Git doesn’t change. Where things tend to differ, obviously, the salaries came up (yes, indeed), and the project management. From a software project management perspective, the workflows, the processes, the amount of checking in and refinement, the ability to state any idea of what the requirements are, either up front or as you’re going, differ in these two contexts.

As you would expect, because if you think about a product, then the requirements are slightly different to a research project, which is usually limited in its lifetime. A research project has funding maybe for three or four years, and maybe there will be something out of it that can then be used for other research projects, so it lives on a little bit. But in a product you hopefully create something that lives on, that creates revenue for years to come.

Yeah, and the outcome of the software product is a product. In research the product may be answers to questions, maybe further questions. It’s a lot harder to pin down when research is done, which means you typically just agree with your funding body how long you’re going to go on for and what you expect to happen as a result of this, and then you report back on what actually did happen. So from my experience, and now this is leaving the research findings aside and just reflecting on my experience in both, research software engineering has more similarity with early-stage startup companies than it does with the established companies. You know, if you think about Microsoft Word, we know what we want from Microsoft Word. It’s existed for 40 years now. It can get more stable, it can become easier to use, but you’re adding things that customers want from a large base of customers. If you look at a seed startup, the product is almost irrelevant. What they’re trying to do with their money is validate that they are asking the right question. Now, they are validating it in a marketplace; they’re saying, can we find something that we can get customers for? And they’re doing it by building prototypes, by doing user experience studies, you know, essentially by doing some form of research and development. But the product doesn’t come until later, when you’ve answered a load of those initial questions and converged on something that your team, your investors and your customers actually agree is the right question to be asking. Now that is a lot more similar to what’s going on in research, except that there are also RSEs who are working on fairly large infrastructural projects.

But I think it’s finally quite interesting that you try to compare that with startups from

your perspective. But going back to the study, how many people did you interview, and how many people did you talk to for your PhD?

So it was a fairly small number, because they were open-ended interviews, which means I was talking with people for about an hour, you know, starting with questions like “which bits of your work do you get to choose how you do?” in order to explore this idea of autonomy. It was a dozen people in the initial round. Most of them either were employed as research software engineers or considered themselves to be research software engineers despite not actually having that title, and then a couple were managers or group leaders; I think there were four of those. So you sort of have 50% actual people who are on the ground doing RSE, maybe then four people who are leaders of people who do RSE in one way or another, and then a couple of people who work with RSEs.

You mentioned somebody from a journal, so what made you choose somebody from a journal? I mean, journals obviously play an important part in science, as we know, because it’s publish or perish, but why a journal in terms of research software engineering?

Because part of the question over the autonomy or the separate identity of the RSE role is how separated research software engineers are from the motivating factors of being a researcher, you know, from being evaluated as a researcher effectively. It came up both in my research and in informal discussions, I think even at the conference talk: you know, I want to be doing the software, but I’m measured on how many papers I write. To some people the ability to contribute to research is very interesting. One of my participants, and of course the data was collected anonymously so I’m not going to identify who it was, described research software engineering as being a professional middle author. Now that implies, you know, that the role is connected to the production and the publication of research output in journals. There’s a bunch of interesting questions there from the perspective of the journal machinery itself, and thus the editor. Can you publish software without it being attached to a paper? What does that mean? Software is a sort of evolving thing; it isn’t a tangible product. You can always change the source code, so if you publish software, does that mean you’ve frozen it? If you haven’t frozen it, then how does the publication event map onto the ongoing timeline of the software? And then also even questions like: what requirements can, or should, an editor put onto the software? Should they require that the source code is deposited in an archive? Can they require that any researcher can run exactly the same software and get the same results? Which maps onto these ideas of FAIR, which is findability, accessibility, interoperability and reproducibility. And if journal editors aren’t currently making those requirements but think it would be a good idea, then you have questions of why not, and what’s the pathway to get there, and do research software engineers agree that those are good ideas and would like to get there, and do the researchers that they’re working with agree that it’s a good idea? I’m sure we’ll come on to this question of why are we still talking about this 10 years after RSE was coined as a term. You find that you have to move the entire system in order to make this change to a fairly important but, you know, small part of the system. It’s appropriate to ask who else is involved, what change are they trying to achieve, and how does that complement or conflict with the changes that we in the RSE community are trying to make.

From that point of view it’s quite important that you do include journals, and in fact, as you quite rightly say, journals play such an important role in science, the recognition of science. But it’s an interesting aspect, of course: what value does a paper that contains software, which probably won’t be a paper but maybe a link to a GitHub repo, really have in terms of publication? Because the moment you publish it may be the moment it has already changed.

Right, and then what does that mean for the peer review process? Because presumably the peer reviewers, I was going to say presumably the peer reviewers saw the version that existed at the time of submission, but maybe the thing was being changed even then. Do the peer reviewers in, well, even a data science journal, but let’s say that you’re a computational biologist or an RSE in a computational biology group, do the peer reviewers that are engaged by a biology journal have enough software engineering knowledge themselves to actually perform a valuable review, to decide whether the software as part of the publication artifact contributes to the science or not?

What I want to discuss next is recognition, and that takes us back to the conference, because the conference

actually had a number of sessions where it was asked who are RSEs. That was your session, and I think yours was followed by somebody else who did a survey, and then the following day, I remember, there was a whole workshop on who are RSEs. And I was wondering, it’s quite odd, because we’ve been at it for 10 years, at least in the UK, and we now have positions, so why are we still talking about this?

Partly it comes down to this idea I mentioned before of having to bring the whole system around. If you imagine that, you know, any individual RSE has a lot of influence on the groups that they immediately work with, and some influence in the institutions and the academic community that they’re based in, and then some tiny amount of influence on the entire edifice of the funding institutes and Universities UK and the management and all of the thing that’s going on. So we’ve got this massive container ship, if you like, and we’re trying to turn it around to point in a different direction. And we are hopefully doing that, (a), because we think that it would be better, I think that probably most RSEs agree that better software will lead to better research, and also, hopefully, and this is perhaps the bit that’s more relevant to my research, hopefully we all agree enough on what we mean by better and on which direction we think we should point the container ship that we’re all trying to steer it in roughly the same direction, and that that’s going to lead to it changing. There isn’t any CEO of research in the UK. You can’t just get one person to believe in RSE and then they put out an edict that says we’re going to do this. Part of the reason we’re still here 10 years later is just the amount of time it takes, and part of it, I think, is that there is, I’m not saying that there definitely is a conflict here, but there is the potential for conflict, and this is one of the things that came up in the panel discussion.

For people who weren’t there, my session at the RSE conference was what I called an inverted panel, so I presented about 10 minutes of what we’ve been talking about, the findings from my research, and then just said, okay, now what do you think? This idea came up: you know, what we’re trying to do is to improve the software engineering capability in universities. What is sometimes happening, I don’t know how frequently, but I get enough anecdotes about this that I think it’s probably happening quite frequently, is that RSEs are seen as people you can outsource your software problems to.

Yeah, a consultancy, basically.

Yeah, exactly. Another aspect of how I did this research was I invited all my interview participants to group discussions, which were held under the Chatham House Rule, where you can report on what was said but you can’t report on who said it, to see a summary of the data that I’d gathered and to then discuss that and to see whether what I had concluded from it made sense. In one of these there was this very, fairly heated discussion that took up a lot of the session on whether the idea of service, the idea that RSEs are providing a service to researchers, implies servitude, and this idea that in what one person described as the snobbish academic culture in the UK, by which I think they mean the sort of hierarchical nature where you have, you know, the genius with the blackboard who then just has some minions to go do the work for them, whether the RSE is in danger of being seen as a minion and not as a peer collaborator who is providing intellectual input into the research. If what we’re trying to do is to improve the way that software is done, what we’re really trying to do is to disseminate software engineering knowledge and understanding of the challenges and the solutions to those challenges throughout academia. But the natural thing to do is for someone to say, well, I don’t understand this, you do understand this, can you do this for me? And actually, on a tactical basis, on a short-term project-to-project basis, some of us might enjoy that. You know, if we’re told, go into your cave and write the software and then tell me when it’s done, and what we like doing is writing software, which is why we gravitated towards these roles, that might be quite appealing. But it might also fail to achieve the strategic goals; it may fail to disseminate this knowledge because we’re actually just doing it ourselves.

Yeah, which brings me to the slogan of “better software, better research”, because then we need to talk a little bit about what we mean by better software: better software written by people who actually are software development experts, if you like, or better software across the board? And

what does better mean in the first place?

Right, and yeah, I mean, that’s a sort of challenge that I posed to some of my participants, that, you know, they say (indeed, I remember that) try to make it better, and I’m like, well, what do you mean by that? Often there’s a sort of, I see this in industry as well, there’s a sort of “I’ll know it when I smell it” definition of better. We have these sort of tacit models, especially when it comes to, I’m going to up my pretension points here and cite the book Zen and the Art of Motorcycle Maintenance. Someone once told me, if you think you know what the book Zen and the Art of Motorcycle Maintenance was about, then you didn’t know what the book was about. It has these ideas of quality in it, external quality and internal quality. There’s a section in the book where there are two motorcyclists, and one of them has a really shiny bike that’s like a work of art, you know, like a sculpture, and they really like keeping it clean and riding this bike, and the other person has a bike that’s fairly beat up, but they know how everything works, and if something breaks on it they can take a bit of it apart, fix it, put it back on and carry on going. These are the external quality, what you see and perceive of the artifact as a whole, and then the internal quality, what you understand of the workings and the inner parts, and you can apply that to software as well.

Indeed, exactly. In fact you can apply that to research software engineering. You have the people who go to the garage, so the researchers say, give me some software, and then you have the people who understand what they’re doing and write the software themselves.

Yeah. So is better software better because it’s easier to work on, or better because it, you know, provides a better experience for the people who are using it, or is it both? How do you trade off working on one versus the other? From the internal quality perspective we could look at something like maintainability, and that is something that is important. Internal software quality in industry is: have I developed this thing such that when my marketing director comes and says 55% of our customers want this new feature, can we add it in a reasonable amount of time? Now in software, in research software, sorry, we’ve already talked about the fact that there is some infrastructural software where we’d want to be able to do that. If everybody needs a new feature in pandas, we’d better be able to add that feature to pandas. On the other hand, research software also involves a load of scripts that are written once to decide whether some data coming from an experiment is worth further investigation, or to generate figure three in a paper. That software doesn’t necessarily need to be maintainable, because nobody’s ever going to touch it again. Except that if you treat it as if it’s never going to be touched again, you can guarantee that it will actually be used elsewhere. The worst-case scenario always happens. So it means that this question of what is better software, or what is quality software, is situational. Like I said, I think that many people have tacit understandings of what they mean when they say better software or quality software, but I don’t think that there is necessarily a shared framework for: research software in this context needs to be treated this way, or research software that is going to be part of infrastructure needs to have this set of quality standards applied.

Well, it reminds me of an interview I had with Derek Jones, who wrote a blog post called “Is research software engineering a tangled mess?”, which basically touched on exactly the same point: that a number of people are writing software, you know, because they need something for the research project, with no view on maintainability and sustainability and all the other aspects that you’re looking for when you’re designing and developing a product. So, over to you: do you think that research software is a tangled mess?

I think it definitely is. Whether or not we are untangling it is the interesting question. I mean, I

Remember reading Derek’s post I think he was fairly pessimistic but then you you can find other people like Ben Golder has recently written a report where he talks up the importance of research software engineers and the and the contributions they make you know you can find like optimistic takes as well I

Mean I would say that this is not a problem that is unique to research software and that commercial software is also a tangled mess let’s go back to that example of like the seed start up company they’re going to try some prototypes you mock up some uis try and

Work out where what they call the product Market fit is I where the thing that they want to build and the thing that customers want overlap and then they find that and they’re going to build a product now a disciplined way to ensure quality at this point is to say

Okay well the the Prototype was a useful learning experience now we need to build something that customers will actually use so let’s just take that knowledge and start again and and build a thing that is designed to be used by customers and extended into the future and to have

A l which of course doesn’t happen exactly it doesn’t happen we do instead is go okay well customers will buy this thing tomorrow can you put the Prototype onto a website somewhere and start selling it and then we’re in the same boat that the RSS are in they wrote a

Script to like support the writing of one paper and then five years later someone else is still trying to use it I’m fairly optimistic that like we are working in the same direction there is actually a lot of maybe implicit consensus but consensus nonetheless that

What we need to do is to improve the training availabilities to researchers in general about how software works and there is coalescence around particular ideas like fair for research software like reproducible research and then reproducible software environments that is going to lead to generally a raising of the water for research across the

Board because like these things are going to disseminate out they’re definitely happening at different rates in different research disciplines and countries yes different countries as you said the UK was you know out in front in sort definition of RSC and it’s not yet happening everywhere else and also it

Isn’t happening very quickly and so like someone who’s just going right like what is the sa of software in 2022 or whenever Derek wrote that pose yes I still see these problems that’s an interesting point itself you it is the end game for rsse that there are no RSS

Because the research Community is now capable of doing software for itself or is it that there is some sort of stable state where there are a number of rs’s and a better standard of software in research I’m going to leave that one for the comment section I think

No, yeah, well, actually that's a nice segue into the final question, which is: is there ever going to be an established RSE role? Is it ever going to be an untangled mess, and is that even a desirable state to be in? There's obviously a need for software in

research; you need to have people who are able to write software. But maybe the state can change and everybody will be able to write software, going back to the motorcycle example you gave. I think it's easier to say that I would like to see an explicit and shared understanding of

what it means to make software better, so that we can all understand whether we are working towards that or not, and in 10 years I would like to see that being expressed. I don't think that that's actually that far away. I think that the Society for Research Software

Engineering are working on that, I think the Software Sustainability Institute is working on that, I think there are various working groups in various research disciplines who are all trying to identify that, and so some kind of manifesto or convention on software in research is maybe only two years away,

which means that there's then eight years of: I've got eight years of spare time now, what happens in that? I would say, well, it's all about adopting and internalizing those values in research communities. The RSE community as it exists now will play an important role in that, because, you know, we're the

people who understand how software works, how software engineering works, how to run software projects, so we've got a head start. But I also think training, and not training for new RSEs, training for new researchers: PhD programs, master's programs, looking at making software as much a part of the next generation of

researchers' skills as writing a paper is. There are scientific disciplines, and actually I'm going to definitely attribute this quote to my supervisor David Gavaghan, because I know that he said that there are scientific disciplines where using a lab book is absolutely non-negotiable; it is part of the protocol. Software is essentially an

online form of lab book. It is a method; it should be as non-negotiable as a lab book is. In addition to teaching people to write lab books, which is what we do in courses if you do a practical in biology or in physics even, we also need to teach

them software. Yeah, or basic principles of that. Yes, not every researcher is going to need to use software every day, but then even when they're communicating with, let's say there are still RSEs in 10 years' time, you still need to understand enough about how the process

works in order to be an effective client. So I would like to close with a question about your PhD: how far away are you from finishing it? Oh dear, that's a question you never ask a PhD student, isn't it? When can you get your social life

back? So I'm working with three very good supervisors: David Gavaghan, whom I've already mentioned, Helena Webb at Nottingham University, and Susanna Sansone, who's in the Oxford e-Research Centre in Engineering Science. They've all been incredibly helpful and got me to the point where I am now. I've been

doing this for two years around a job as a full-time RSE or software engineer. I would expect to have another two years at least. Well, thank you so much for your time, Graham. I wish you all the best for the future and for your thesis. Thank you

very much, Peter, this has been a great conversation. Thanks for having me on. If you want to find out more about the wondrous world of research software engineering and want to meet like-minded people, or indeed join the community, why not meet us at our conferences? As I

mentioned at the beginning, the UK will hold its annual meeting in Swansea between the 5th and 7th of September. The RSE community in the US will have its very first in-person and hybrid conference in Chicago between the 16th and the 18th of October. And if you

happen to be in Germany at the end of September, there will be an unconference in the city of Jena between the 26th and the 28th of September. Oh, time's up. See you next time. But before I forget: this podcast is covered by the Creative Commons licence. See you.
