Interview with Vladimir Potapov: The NEBridge Ligase Fidelity Tool

Transcript

Interviewer: Lydia Morrison, Marketing Communications Manager & Podcast Host, New England Biolabs, Inc.
Interviewee: Vladimir Potapov, Ph.D., Bioinformatics Scientist, New England Biolabs, Inc.

Lydia Morrison:
Welcome to the Lessons from Lab and Life Podcast brought to you by New England Biolabs. I'm your host Lydia Morrison. I hope this episode brings you some new perspective. Today I'm joined by Vladimir Potapov, a bioinformaticist and 12-year NEB employee, an important part of the team that builds online tools to aid both NEB scientists and our customers in calculations and experimental design.
Vladimir, thanks so much for being here with me today.

Vladimir Potapov:
Oh, hi, Lydia. It's good to be here.

Lydia Morrison:
Could you introduce yourself to our listeners and explain what it is that you do here at New England Biolabs?

Vladimir Potapov:
My name is Vladimir Potapov. I'm a research scientist. I'm in the research bioinformatics group. At NEB, we have a large research department with various scientific divisions. Part of the role of the research bioinformatics group is to interact with scientists within the research department and help them analyze their data and experiments, and to conduct our own research.

Lydia Morrison:
NEB offers a variety of open access tools that are available online to support life science research. Could you tell us about the tools that NEB creates and makes available for the public to use?

Vladimir Potapov:
Yes. There is a whole set of tools aimed at different users and at different aspects of the work. For example, you can go through the product selection tools that help you choose a particular product. Or, for example, we have a set of restriction enzyme tools where you can find the optimal enzyme for your particular experiment, or you can find the buffer that is optimal for those enzymes.

We also develop tools for other people to, for example, do DNA assembly using NEBuilder HiFi assembly, or Gibson assembly, or golden gate assembly. We also help people design primers for mutagenesis, and we have different calculators to do conversions between RNA and DNA quantities and different physicochemical properties of those molecules.

Lydia Morrison:
Really, a wide variety of tools that range from everything from helping make calculations, to helping design primers, to full experimental design?

Vladimir Potapov:
The goal is to simplify the work of other users, either to analyze their data or to design their experiments. Some of those tools we use internally in our research, and when they are relevant to people outside, we also provide the tools to customers.

Lydia Morrison:
Yeah. These are really valuable tools, and I know our customers really appreciate all the effort that you and the other bioinformaticists and software developers at NEB have put into creating them.

Vladimir Potapov:
Exactly.

Lydia Morrison:
You were integral in the development of our NEBridge Ligase Fidelity Tools, which are used for aiding in the design of golden gate assemblies. Could you share with us a story about how that suite of Ligase Fidelity Tools came about?

Vladimir Potapov:
There's an interesting story behind the Ligase Fidelity Tools. Actually, that goes back to what I said at the beginning, that at NEB we have a large research department and we conduct a lot of research, both purely basic research and applied research. As part of that research, we are trying to understand the properties of different enzymes, for example, T4 DNA Ligase. As part of that particular project, we were trying to understand mismatched ligations of overhangs that can be carried out by T4 DNA Ligase. A large experimental study was designed, and from that experimental study, we got a lot of data on the biases and preferences of T4 DNA Ligase in ligating those overhangs.

As scientists, we became interested. Sometimes we have a particular pair of overhangs, and we are trying to understand how these overhangs are going to be treated by T4 DNA Ligase. Originally, we had a large spreadsheet, and we would go and check every individual overhang, but bear in mind that is somewhat tedious. So we created a tool where you can provide a set of overhangs and the program will automatically extract that information from our experimental data. We quickly realized the value of that because it very easily allows us to see how good the overhangs are. Are they compatible? Can they be used in golden gate?

To summarize it, it comes from our internal research program where we're trying to understand properties of T4 DNA Ligase in the context of overhang ligation.
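The spreadsheet lookup described in this answer can be sketched in a few lines of Python. This is only an illustration: the ligation counts in `TOY_COUNTS` are made-up numbers, not NEB data, and the function names and the per-junction fidelity formula are assumptions chosen for the example, not the method used by the actual tool.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(overhang: str) -> str:
    """Return the Watson-Crick partner of an overhang, written 5'->3'."""
    return overhang.translate(COMPLEMENT)[::-1]


# Hypothetical ligation counts for unordered overhang pairs (NOT real data).
TOY_COUNTS = {
    frozenset(("AAGG", "CCTT")): 950,  # correct pair
    frozenset(("AAGA", "TCTT")): 900,  # correct pair
    frozenset(("GGTA", "TACC")): 980,  # correct pair
    frozenset(("AAGG", "TCTT")): 50,   # mismatched ligation
    frozenset(("AAGA", "CCTT")): 30,   # mismatched ligation
}


def pair_count(a: str, b: str) -> int:
    """Look up how often two overhangs were observed ligated together."""
    return TOY_COUNTS.get(frozenset((a, b)), 0)


def fidelity_report(overhangs: list[str]) -> dict[str, float]:
    """For each junction, the fraction of ligation events that are correct.

    Both strands of every junction are present in the reaction, so each
    overhang and its reverse complement can cross-ligate with any other
    sequence in the set.
    """
    present = set(overhangs) | {reverse_complement(o) for o in overhangs}
    report = {}
    for o in overhangs:
        partner = reverse_complement(o)
        correct = pair_count(o, partner)
        wrong = sum(pair_count(o, x) for x in present if x != partner)
        wrong += sum(pair_count(partner, x) for x in present if x != o)
        total = correct + wrong
        report[o] = correct / total if total else 0.0
    return report


if __name__ == "__main__":
    for overhang, score in fidelity_report(["AAGG", "AAGA", "GGTA"]).items():
        print(f"{overhang}: {score:.3f}")
```

With these toy numbers, `AAGG` and `AAGA` lower each other's score when used together, while `AAGG` and `GGTA` do not interfere, which is the kind of compatibility question the answer describes.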

Lydia Morrison:
Yeah. I think that's a really powerful example of how the research, the basic scientific research that we're doing at New England Biolabs can produce these large datasets that can really help our customers inform decisions about what overhangs are best for them, or maybe what primers are best for them, or what enzymes are best for their experiments. So you talked about the NEBridge Ligase Fidelity Viewer, are there other tools in the NEBridge Ligase Fidelity toolset that you could tell us about?

Vladimir Potapov:
Yes. Actually, there are two complementary tools. These two complementary tools address slightly different aspects of the golden gate assembly workflow. There are a lot of people at NEB and people outside who have modular parts that they would like to assemble together. What they need to know is which particular overhangs will be good to bring those parts together. In that particular case, I say, for example, "I need a set of five overhangs which are optimal for creating that construct."

For this, we developed a tool which is called the NEBridge GetSet Tool. Essentially, the user can specify a particular enzyme and experimental conditions and say, for example, "I would like to have a set of five, or 10, or 15 overhangs." That tool will automatically pick the best set of overhangs for the given experimental conditions.
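The idea of picking the best set of a requested size can be illustrated with a brute-force search. This is a sketch under stated assumptions: the interview does not describe the tool's algorithm, and `crosstalk_proxy` is a hypothetical sequence-similarity score standing in for the experimentally measured ligation data that the real tool relies on. All function names are invented for the example.

```python
from itertools import combinations

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def similarity(a: str, b: str) -> int:
    """Number of positions at which two equal-length overhangs agree."""
    return sum(x == y for x, y in zip(a, b))


def crosstalk_proxy(a: str, b: str) -> int:
    """Hypothetical risk score (an assumption, not measured data).

    The more b resembles a on either strand, the more likely b's junction
    is to mis-ligate with a's partner.
    """
    return max(similarity(a, b), similarity(a, reverse_complement(b)))


def pick_overhang_set(candidates: list[str], size: int):
    """Exhaustively search for the set with the lowest total proxy score."""
    # Palindromic overhangs can ligate to themselves, so drop them up front.
    usable = [c for c in candidates if c != reverse_complement(c)]
    best_set, best_score = None, None
    for combo in combinations(usable, size):
        score = sum(crosstalk_proxy(a, b) for a, b in combinations(combo, 2))
        if best_score is None or score < best_score:
            best_set, best_score = combo, score
    return best_set, best_score


if __name__ == "__main__":
    pool = ["AAGG", "AAGA", "GGTA", "ACGT", "CTTC"]
    print(pick_overhang_set(pool, 3))
```

Exhaustive search is only practical for small pools; with all 256 possible four-base overhangs and sets of 10 or 15, some heuristic or stochastic search would be needed instead.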

The other aspect of that workflow is that sometimes people are already working with an existing, let's say, large nucleotide sequence. They would like to find a way to split that large nucleotide sequence into a set of smaller fragments. They would like to find optimal points within that nucleotide sequence that can be used for generating overhangs. One of the nice features of golden gate assembly is that it's a scarless assembly, so you can reassemble your construct without leaving marks, but you need to find those optimal overhangs in the sequence. For this particular task, we developed the NEBridge SplitSet Tool, where the user can enter a particular nucleotide sequence and indicate approximate regions where they would like to introduce the cut sites. The tool will automatically optimize the cut points based on the specified experimental conditions.
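Searching for cut points near user-specified positions can be sketched as follows. This is an illustration only, not the SplitSet algorithm: it assumes four-base overhangs (typical of commonly used Type IIS enzymes), uses simple compatibility rules (no palindromes, no duplicates, no overhang that is the reverse complement of another) in place of experimental fidelity data, and prefers cut points closest to the requested positions. The function names and parameters are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def is_compatible(overhangs: list[str]) -> bool:
    """Reject palindromes, duplicates, and reverse-complement pairs."""
    seen = set()
    for oh in overhangs:
        rc = reverse_complement(oh)
        if oh == rc or oh in seen or rc in seen:
            return False
        seen.add(oh)
    return True


def choose_cut_points(sequence, targets, window=3, overhang_len=4):
    """Pick one overhang start near each target position.

    Returns (drift, positions, overhangs) for the compatible choice whose
    positions deviate least from the targets, or None if nothing fits.
    """
    options = []
    for t in targets:
        lo = max(0, t - window)
        hi = min(len(sequence) - overhang_len, t + window)
        options.append(range(lo, hi + 1))

    best = None
    for positions in product(*options):
        # Neighbouring overhangs must not overlap in the sequence.
        if any(b - a < overhang_len for a, b in zip(positions, positions[1:])):
            continue
        overhangs = [sequence[p:p + overhang_len] for p in positions]
        if not is_compatible(overhangs):
            continue
        drift = sum(abs(p - t) for p, t in zip(positions, targets))
        if best is None or drift < best[0]:
            best = (drift, positions, overhangs)
    return best


if __name__ == "__main__":
    seq = "TTTTACGTTTTTGGCATTTT"
    print(choose_cut_points(seq, targets=[4, 12], window=1))
```

In the usage example, the first requested position falls exactly on the palindromic overhang `ACGT`, so the search shifts that cut point by one base while leaving the second one where it was requested.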

Lydia Morrison:
How do I see the alignment of these overhangs and these recommendations from the GetSet tool, how do I see that affecting the accuracy of the final assembly or the success of the final assembly?

Vladimir Potapov:
Yes. The overall goal of a technique like golden gate assembly is to assemble a larger construct from smaller pieces. Ultimately, the success of the golden gate assembly depends on the set of chosen overhangs. Those tools are designed to automate and help users choose those overhangs based on the experimental data that was derived during the course of the experimental series.

Lydia Morrison:
It really can take a lot of the trial and error of finding these optimal combinations of overhangs out of the experimental design for researchers?

Vladimir Potapov:
Yes, that's exactly the point. How to pick the optimal set in a user-friendly way.

Lydia Morrison:
You've mentioned a couple of tools in the NEBridge Ligase Fidelity toolset. Are there any other tools that are available within there?

Vladimir Potapov:
When we made those tools available to outside users, external users, there was actually a lot of interest. Many people, when they go and use our tools, find they work well if you work with one sequence, two sequences. But really quickly, we started getting requests from people who would like to do it for a lot of sequences at once. We were thinking about how to enable those users to perform this task. For this, we developed a particular set of tools which we call the NEBridge SplitSet Lite API. Essentially, an API is an application programming interface, and it's a programmatic way for users to automate their tasks. People who are familiar with programming can use our API and do a batch analysis of their sequences.

There are people who don't want to use programmatic access, but they still would like to analyze many sequences. For this particular situation, we developed a tool which is called NEBridge SplitSet Lite High Throughput, where people can provide a list of sequences in various formats. The tool has a nice graphical user interface that allows them to accomplish the same task without relying on a programming interface.

Also, as an option, we provide the overhang optimizer code. That code was originally used in our internal research to develop all those tools. We also make it available to other people who would like to maybe take that code and run it internally or adjust it to their own needs.

Lydia Morrison:
Wow, I actually had no idea that we share so much of the build information with our customers. We really put that API interface in their hands as well to allow them to use the tools to the best capability for their particular use case. How many sequences are we talking about when we're thinking about high throughput users?

Vladimir Potapov:
Well, it realistically can be hundreds of thousands of sequences. It mainly depends on the time it takes to run the calculations. But for a set of about 100,000 sequences, that's within seconds to minutes.

Lydia Morrison:
Wow. A really powerful, fast tool to enable folks to quickly look through their data and see what's successful and what might improve outcomes for their experiments. What are some of the challenges that you've faced in developing these tools?

Vladimir Potapov:
As bioinformatics scientists in the research department, we develop tools internally and externally to address specific problems. Usually, we develop the tools to help conduct the research. When we conduct the research, the goal can be shifting because you're trying to develop a certain assay. When you start working on it, you realize that you need to make some improvements, adjustments. When you change the experiment, usually you have to adjust your analysis workflow.

Essentially, when you think about the resulting software, it's a history of following the research projects. Depending on how the research is going, it might be necessary to take several iterations through the development and analysis pipeline.

Lydia Morrison:
How long does it take to develop these tools? How long did it take to develop the NEBridge Ligase Fidelity toolset?

Vladimir Potapov:
It's part of the long-term research program at NEB. The project has several stages. One stage is collecting the data. The second stage is understanding how the data is organized and how to process the data. Depending on the particular project, it can take from weeks to months.

Lydia Morrison:
That's actually a lot faster than I thought it would take. Probably the gathering of the empirical data is one of the longest factors, yes? Or is it figuring out how to sort through and organize the data to make it make sense?

Vladimir Potapov:
I would say it's always project dependent. Sometimes the experimental part is the difficult part. Sometimes the computational part is the difficult part. But usually, it involves several iterations. Once you start analyzing your data, you start to find something you didn't expect that prompts you to either change your experiment or change the analysis workflow.

Lydia Morrison:
I know that I've been using NEB's tools for a long time, starting from when I was in graduate school and I would consider the catalog a tool. I think with the advent of our online tools, starting with NEBioCalculator and the Enzyme Finder, all the way to the newest tools that we've released today, including the NEBridge Ligase Fidelity Tools, these are incredibly valuable assets to researchers designing and performing experiments. I just think a huge kudos should go out to your team for making all the tools from the inside research at NEB available to our customers in an open access sort of way.

Where we're not just sharing a magic black box where they're putting in their sequence, and their primers, and their annealing temperature, or whatever, and we're spitting out an answer for them. But having the API available, having the code available for individuals who have that skillset to be able to integrate it into their workflows and have it be the most powerful tool it can be in their hands, I just think it's pretty incredible work that you and your team do. I wanted to thank you so much for being here today to share the details of it with us.

Vladimir Potapov:
Thank you.

Lydia Morrison:
Thank you for joining us for this episode of the Lessons from Lab and Life Podcast. Please check out our show's transcript for helpful links from today's conversation. As always, we invite you to join us for our next episode when I'm joined by three incredible young scientists whose work was foundational to the development of RNA vaccines. Olubukola Abiona, Geoff Hutchinson, and Dr. Cynthia Ziwawo share what it was like to be on the front lines of the development of the first COVID vaccine.

A step-by-step video is available to guide you in the use of these tools.

NEB Podcast #73 - Interview with Vladimir Potapov: The NEBridge Ligase Fidelity Tool
