Open Sourcing a Private Repo
Season 10, Episode 8 | October 30, 2024
Bekah and Dan discuss the considerations and steps involved in open sourcing a private repository, sharing insights from their own experience with the Virtual Coffee community docs.
Show Notes:
In Episode 8 of Season 10 of the Virtual Coffee Podcast, hosts Bekah and Dan delve into the process of open sourcing a private repository. Drawing from their experience with Virtual Coffee's community docs, they discuss considerations like verifying the safety of repository content, ensuring appropriate permissions, and maintaining the privacy of sensitive discussions. They also touch on the importance of updating READMEs, licensing, and providing clear guidelines for new contributors.
Sponsor Virtual Coffee!
Your support is incredibly valuable to us. Direct financial support will help us to continue serving the Virtual Coffee community.
Please visit our sponsorship page on GitHub for more information - you can even sponsor an episode of the podcast!
Virtual Coffee:
- Virtual Coffee: virtualcoffee.io
- Podcast Contact: podcast@virtualcoffee.io
- Bekah: dev.to/bekahhw, Twitter: https://twitter.com/bekahhw, Instagram: bekahhw
- Dan: dtott.com, Twitter: @danieltott
Transcript:
- Bekah:
Hello, and welcome to Season 10. This is Episode 8 of the Virtual Coffee Podcast. Virtual Coffee is an intimate community of people at all stages of their tech journey, and we're here to share insights, experiences, and lessons learned along the way. I'm Bekah, and I'm here today with my co-host, Dan.
- Dan:
Thanks, Bekah. This season, we're pulling from our back pocket topics from our Tuesday Virtual Coffees. Today's topic is open sourcing a private repo. We are grateful to be sponsored by Level Up Financial Planning. Level Up Financial Planning helps you take your financial confidence to the next level. It's real financial planning to help you reach your goals and gain clarity on what actions you need to take now to maximize your tech career. Level Up even has a podcast where you can hear about some of the strategies he uses with his clients. Check out levelupfinancialplanning.com and you can get that link in our show notes.
- Bekah:
Alright, so it's time to grab your favorite drink, settle in, and let's get started with today's Virtual Coffee podcast. We hope you enjoy this episode.
- Dan:
Yo, wuddup Bek? How's it going?
- Bekah:
Hey, it's going okay. How's it going with you?
- Dan:
I'm doing okay. Um, today we have a pretty interesting topic. It's about what do you do when you are, when you have a private repository, uh, and you want to make it open source. And so there's a lot of things I think involved in that. Um, and we actually just did this with Virtual Coffee.
- Bekah:
We did. This year we open sourced our Virtual Coffee community docs. And so, we have a blog post that we'll link in the show notes about this, but one of the reasons why we decided to open source it was we have kind of perfected, I would say for us, the processes that we use to keep Virtual Coffee running, to expand, to support our volunteers, and we felt that we were at a place where we could share it with others and that they might be able to benefit from the resource as well. So that's one of the reasons why people sometimes open source their repositories. It's more of a resource repository for us. We don't really expect to have users in the typical way that somebody might use software, but we can talk about that today too.
- Dan:
Yeah, I think it was a pretty cool thing. And just for a little more context, VC community docs is the repository, and we had it private for, uh, years, you know, for a long time, and we used it internally for putting together scripts for people, organizing our different groups, all sorts of things like that. And yeah, at some point we realized, and it was a while ago I think, but we realized there's no real reason to have it be private and it could be useful for other people. Um, and it wasn't just a flip of a switch. I mean, any repository on GitHub, you can just go into settings and say, hey, this is public now, right? But there was a bunch of work that we wanted to do before we did that. Um, so I think you can break this up into a few different areas of things you want to think about, right? If you're at this spot and you have a project that you want to make open source, one is the content. I don't want to say content because it's not always just words, but the stuff that's in the repository itself, right? Like, is this all safe to release? Um, are you allowed to release it, right? So, uh, if it's something that you just wrote yourself, I mean, you know, you are going to own all of it, right? But if it's for work, or maybe you borrowed some stuff from, you know, from work or from other private things, right? You want to make sure both that you have permission and that it's appropriate to make public, right? And that's kind of, I think, the first level of thinking with this sort of thing, right? That's the kind of thing you think about first. Um,
- Bekah:
Yeah, and you also want to think about the types of interactions and conversations that you've had in that repository, and is it okay for people to be able to read those things? You know, your history is all there in that repository, so have you said something or had a discussion that is meant to remain private forever? Then you might need to think about, okay, what makes sense here? Like, how can we proceed with this? Or is there information that, you know, maybe we've walked back on, or we used to do something in a certain way and we changed it and there was a conversation that spurred that change, but it's not reflected here. And so sometimes not all of the thought process or the growth process is captured, and you want to make sure that you are keeping, um, I guess the people who are involved safe, but also the understanding of where you came from when you made these decisions.
- Dan:
Yeah. So when you're talking about that, what kinds of things are you looking at to kind of scan through?
- Bekah:
Yeah, so, uh, for instance, you might have a discussion board there. And for us, we have, um, a Code of Conduct at Virtual Coffee, for example. So if someone submitted a Code of Conduct violation form, did it go to that repository? Because that's something that we definitely want to keep private to respect the person who submitted it, and also the person who might have been reported. Now for us, our Code of Conduct violation forms do not get sent to that repository. Um, they go somewhere else and discussions happen there. And so that was something that we didn't have to think about. Um, but another example might be, um, you know, for instance, our Lunch and Learns previously were called brown bag events. And so somebody brought up that there were some racial connotations to that that were really negative, and we didn't want that represented in our community. Um, but there is like a decision that had to be made there. Uh, and I think that it's important to, you know, convey decisions like that or to show, okay, there was a thought process here. We were unaware of the origins of that term to begin with and then we changed it here. So, you know, depending on how that conversation goes, uh, when the conversation was had, it was under the premise that this was a private repository and it wouldn't be shared with a wider audience. So, you know, were we respectful in those conversations? Were we open and okay with the way that things were being shared? And ultimately the answer for that is, yes, we were. Um, but you know, I imagine in some circumstances when something's private, your communication with other people, whether in a PR review or in an issue, um, you might speak differently than if you knew you were having a public conversation.
- Dan:
Yeah, totally. And along with that, there could be, uh, you know, sensitive information really in any of those comments. So that's something to think about if you're thinking about open sourcing a project that has had, you know, activity like this, you know, where a lot of people, or even just one other person, really, but, uh, when it's not just you, right. Cause if it's just you and you have years of history of arguing with yourself in, you know, issue comments and things like that, um, that might be kind of a different problem. Yeah. Uh, but no, that's the kind of thing to think about too, right? So there's all this history, and GitHub does a very good job. And I, you know, value this that they do, but they keep, you know, they keep everything forever unless you tell them not to. So, for instance, if it's a work repository and you think it could be valuable for open source, um, that's something to think about too, is going through the issues and pull request comments. And, you know, if there's discussions or any of that, there can be comments in lots of different places, uh, in a repository. And so making sure that there's nothing sensitive really that got through, you know, not just the, uh, privacy stuff that we talked about, but, I mean, maybe you pasted in a, uh, you know, like an API key or something, and you're like, this is a private repo and nobody cares, you know, and, um, it's still there. Um,
- Bekah:
I've known, I've known projects that have done that, you know, they like, well, we have a small group of people, so I just put it in the readme so everybody has access to it, right? Um, that, that's going to be an issue if you decide to open
- Dan:
Right. Yeah. And, uh, you know, along the same lines, um, the code history is also there. So if you had hard coded an API key or something like that in the past, um, and then you removed it and put it in an environment variable or something like that, it is still there in the history. And there's ways to get around that. There's ways to delete it. There's lots of different options. Um, you can use git, like the tool git, to remove that sensitive information from your repository while keeping the history. Because I think the history can be really, really valuable. Um, so, you know, I mean, you could easily get around a lot of these historical problems by copying the code into a new repository, you know, without any of the Git history and, you know, without all the other history. And that could work, you know, uh, and if there's tons and tons of the kind of stuff that we're talking about, that might be a good option, honestly. Um, just start over, copy and paste the code in, not the Git history, um, brand new repository, open source that, you know, um, and so that's one way to ensure that you don't have any of this historical information that you don't actually want to be public, right? Um, but I think if you can avoid that, you know, it can be very valuable to have that historical stuff. You know, I love seeing how projects can change over the years, um, and the way you do that is looking in the git history, but if the history is full of, um, secrets, other private information that we don't want out, then, you know, you've got to make your decision. Uh, so there's ways to get around that either way. Definitely something to think about because, uh, it might not be, you know, your first consideration with this sort of thing, but I think it's an important little bit to do.
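As a rough illustration of the history check Dan describes, text such as the output of `git log -p --all` can be scanned for strings that look like credentials. The following is a minimal Python sketch, not a complete scanner: the function name `find_secret_candidates` and the three patterns are illustrative assumptions, and a real audit would likely lean on a dedicated secret-scanning tool.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive list).
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "generic_assignment": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}


def find_secret_candidates(text):
    """Return (line_number, pattern_name) pairs for lines that look like secrets."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits
```

A hit only flags a line for a human to review. Anything that turns out to be a real key should be treated as exposed and rotated, whether or not it is later removed from the history.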
- Bekah:
Yeah. Yeah. And another part of that that you have to consider before open sourcing something is, now everyone has access to this repository as well. So what do you need to do to prepare for that? And I know, I think Cassidy Williams' startup just open sourced their whole project. I think it's called Brainstory. I'll try and make sure we link to it in the chat. Um, and so this is, uh, definitely something that is a code-based project, right? And so, when you're thinking about that, then what do you need to start with? Like, how do you prepare the repository? And I think one of the first things we had to add for our community docs was a license. We didn't have one, because we didn't really need one, right? And so then we had to determine, okay, how do we want people to be using this? Or, how are we okay with people using this information? Because, hey, they might go and start their own Virtual Coffee. And so that's part of the conversation. Are we okay with somebody taking us exactly as we are and starting their own? You know, it came down to, yeah, you know, because we don't provide for the needs of everyone, and maybe people want something different. Maybe they want a Python Virtual Coffee, or a WordPress Virtual Coffee. Um, but, uh, basically, you know, as long as there's attribution, I think that we felt pretty okay with adding a license that gave them some room there. And so we ended up going with a Creative Commons Attribution license, um, that does give people a little bit more room to use it. And I feel like a lot of projects that are more text-based like ours, or maybe in the creative, educational aspect of things, tend to lean towards using the Creative Commons license.
- Dan:
Yeah. I mean, you'll, you'll hear people talk about that a lot. We have had, oh man, was it a lunch and learn
- Bekah:
I think it was a podcast episode last year,
- Dan:
episode talking about, uh, different open source licenses, and you can get really deep into it. Um, and there's a lot of differences. I think if you're talking about a work situation, um, I would maybe, if your work has lawyers, maybe, you know, have them double check things. But, uh, those licenses are there, and GitHub actually has a really good tool to help choose the right license. Um, so GitHub itself has a lot of information on this, and there's a million other places you can learn about open source licenses as well. But yeah, I mean, you probably don't want to open source something unless you're hoping people will use it. You know, I mean, that's part of the trade of open sourcing, right, is, you know, you hope maybe people will help you, submit issues, help develop, things like that. But in return, you know, you're providing this software for other people to fork and do their own things with. Um, and what you say in your license kind of defines what they're allowed to do. So for something with a lot of content, like our stuff, you know, Creative Commons makes sense because they can, you know, what is it? They can take some of the stuff, with attribution. They can modify it. It's a very good license for a repository that has a lot of written word content, not just code, you know, um, and the content is part of the value of the repository. So some repositories are just code. There's some content in docs and stuff, but that's not the valuable part of the repository. Right. In our case, it was, you know, so,
- Bekah:
Another part, too, is thinking about why you're open sourcing it. So for us, we want to continue to maintain this project. There are projects out there that open source because they have users, but, you know, maybe they're a startup and they're shutting down, or they have decided not to maintain this project anymore. So as part of that, they kind of open source it for the users to say, hey, we're not going to be actively maintaining this, but if you want it, you can fork it, and then you can maintain this project. And so then that comes into consideration for their license, but also, then they need to update the README to make it really clear, like, this project is no longer being maintained. I think you can actually turn off issues and discussions. I'm not sure if you can turn off PRs. Um, but then it provides clarity for anybody that comes across it, or users that see it, like, okay, this is here for you, but we're not going to be making any changes. For us, we're accepting PRs. And so, you know, we actually have, uh, at least a couple there. So a friend of ours, um, Jason Torres, who's a member of Virtual Coffee, but also runs the tech commute community, went through the repository after we open sourced it just to kind of give it an overview. Like, it makes sense to us because we've been in there for the last five years, but Jason has never seen it before. So, you know, he's giving some feedback about, okay, um, it might make sense to reorganize it in this way. Because that's also now part of the consideration, and some of the changes that we made: initially, all the information targeted people who are familiar with Virtual Coffee and all of the things that we do, and who just needed a process reference guide. But if you're coming in from an external community, does it make sense to you? Are you going to be able to use this? Or have we talked about it in a way that's too specific to Virtual Coffee?
And so, you know, getting that second, um, opinion is really useful in being able to, um, allow the repository to be helpful to others as well.
- Dan:
Yeah, totally. You know, and that context switching is important if you want people to engage with your, you know, project. Um, and so you can do a lot of work on the README with that, you know, the contributing guide as well. Um, and, you know, if it's code comments, if it's something that you've just done by yourself for a couple of years, it could be the same thing. So you can run into the same problem with code or content, you know, where it's like, I know what's going on here, cause I wrote it in my head. You know what I mean? So I know the connections that are happening here. Um, maybe it's a good time to go add some more comments that explain a thing or two, you know, explain what's going on. Um, and we're really lucky that he came and did that. It can be nice to have somebody else do that sort of thing because, like you said, they see the gaps, you know. You could try to work out some assumptions about the stuff that you've been looking at for, you know, for years, but, um, somebody coming in totally blank will have a much better insight into what somebody coming in blank is feeling, you know, at any given moment. So, um, I think that's cool. And it's a really nice favor. And if you can find somebody, I think it's fine to ask somebody for that kind of help, you know what I mean? Um, like, hey, before I open source this, can you just, you know, pull this down, try to run it or whatever it is, you know, um, and give me a heads up on, does the README work? Like, if you follow the steps without having my computer, you know, uh, does this work for you? Um, et cetera. So one of our members is a very active contributor and is on a Windows machine. And she finds Windows issues all the time, right?
And, uh, which is good. And I mean, I could, you know, uh, probably try to find a Windows computer somewhere and run it before I, you know, publish it, but it's not like the website doesn't work on her machine. It's, you know, the development stuff. And so that's the kind of thing that a potential contributor can help with, um, finding those kinds of issues. Uh, and those READMEs, I mean, we talk about them all the time, but those READMEs are so hard to keep up to date and to have them make sense for somebody whose brain isn't already in the project, all that stuff. So obviously, if you're about to open source a project, it's a really good time to take a hard look at your README and your contributing guide and all that stuff. Anything that you want people to know or do a certain way. And, uh, you know, you can't make too many assumptions about people coming into the project. So if you've done every single pull request in this exact certain way, and you want them to all be that way, you need to make sure that's written down somewhere, because nobody's going to go through. I mean, yeah, we tell people all the time, yeah, go peruse the old issues, go peruse some completed pull requests, but they're not going to follow a pattern necessarily unless you have explicitly stated it. So think about that kind of thing too. Um, how do I want this project to work with, you know, with outside contributors? And try to get as much as you can down in writing so that there's no surprises and you don't have to explain yourself a bunch of times, uh, or just get mad or whatever, you know.
- Bekah:
And that was part of one of the things that I did when I was going through and auditing the repository too. Uh, and one of the things that I forgot about was, okay, issue and pull request templates. Like, we didn't really need those before because it was only select members that got added to the repository. And we had some very specific issue templates for things like lightning talks, right? And that's useful to us personally, but what if somebody comes in and has an issue about the text content and they need to make an update because it's outdated or whatever? There wasn't an issue form that was set up for that at the time. And I don't think that there was a PR template. There might've been, but it was just really basic. And so then I just went and copied the ones that we were using on our, um, virtualcoffee.io repository just to keep it standard. And then I brought those over, because you also have to think, okay, what do the external contributors need to be set up for success? And how do we set up ourselves for success as maintainers? Because you have to think, you're going from knowing every single one of your contributors very well to potentially not knowing them at all. So how do you kind of bridge that gap and make it okay for everyone?
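The audit Bekah describes, covering the README, license, contributing guide, and issue and pull request templates, can be partly automated. Below is a small Python sketch under stated assumptions: the function name `missing_community_files` is made up for illustration, and the checklist uses conventional file locations on GitHub, which may not match how a given project is laid out.

```python
from pathlib import Path

# A hypothetical checklist using conventional locations; adjust per project.
EXPECTED_PATHS = [
    "README.md",
    "LICENSE",
    "CONTRIBUTING.md",
    "CODE_OF_CONDUCT.md",
    ".github/ISSUE_TEMPLATE",
    ".github/PULL_REQUEST_TEMPLATE.md",
]


def missing_community_files(repo_root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [relative for relative in expected if not (root / relative).exists()]
```

Calling `missing_community_files(".")` from the top of a checkout lists whatever is absent. It only checks that the files exist, so whether a README makes sense to a newcomer still needs a human reader.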
- Dan:
Yeah, totally. Another smaller thing to think about is any connections to projects that you don't want to be open sourced, right? If you have multiple connected projects and you're only open sourcing a small part of it, you know, you need to be aware of how those things work too, right? And so this kind of touches back to all of the points we've already made, you know, so there's, okay, are we exposing sensitive information? Um, do people know how to do whatever we're trying to do, you know? And so, like, I've done this before where I had, um, a little JavaScript thing, and I used it on two different internal projects. And I was just copying and pasting, you know what I mean? And I'm like, well, this could just be a thing, you know, that we can open source. Right. But I needed to make sure it wasn't too specific to the case and that it wasn't, you know, referencing our client's name anywhere, right, for instance. Um, cause it was the same client, multiple projects, so yeah, their name was in there. But if I want to open source this thing that could be useful, um, for anybody, um, I need to make sure it's not, you know, reading from our internal packages and it's not doing other things, you know. And for a little project like that, that can be kind of fun too. And so, I mean, it's a little bit different. I guess that's more of creating an open source project than, you know, open sourcing a private one, but it was private existing code, you know? So.
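The two checks Dan mentions here, no references to a client's name and no reads from internal packages, can be sketched in Python as well. This is only an illustration: the helper names, the `@acme-internal` scope, and the "Acme Corp" client name in the usage lines are all hypothetical, and the dependency check assumes a JavaScript project with a standard `package.json`.

```python
import json


def internal_dependencies(package_json_text, private_scopes):
    """List dependency names that fall under any of the given private scopes."""
    manifest = json.loads(package_json_text)
    names = set()
    for section in ("dependencies", "devDependencies", "peerDependencies"):
        names.update(manifest.get(section, {}))
    return sorted(
        name for name in names
        if any(name.startswith(scope + "/") for scope in private_scopes)
    )


def mentions_of(terms, text):
    """Return the terms (case-insensitive) that appear anywhere in text."""
    lowered = text.lower()
    return [term for term in terms if term.lower() in lowered]


# Hypothetical usage: the scope and client name are placeholders.
package_json = '{"dependencies": {"@acme-internal/utils": "1.0.0", "lodash": "4.17.21"}}'
print(internal_dependencies(package_json, ["@acme-internal"]))
print(mentions_of(["Acme Corp"], "// helper originally built for ACME corp"))
```

Anything either function returns would need to be replaced, made configurable, or moved out before the code is published.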
- Bekah:
No, and I think that's important. I think that is also part of open sourcing a project, because you might find that like, Well, what if I want to open source 90 percent of this, but there's 10 percent of this that I can't open source, then what do I do? Do I just keep it private, or can I, like, abstract that out into another repository and make that private, and then make sure that I take care of it in a way that, that it's removed from this project altogether?
- Dan:
Yeah, and that could be, honestly, a fun code exercise as well, you know, trying to do that and making your project a little more configurable and things like that. And that can be true with content and stuff, too. You know, it's not always just code, right? But if there was some stuff in here, like you talked about the, uh, Code of Conduct violations, right? If that was something that was coming in here, yeah, we'd need to move those somewhere else, right? We still need to reference them. Like, they still need to be a thing that exists, the Code of Conduct, and what to do if you, you know, have a Code of Conduct problem. We don't want the actual reports coming into the, uh, the public one. And so that kind of thing, like, if you run into, oh man, this part of this project absolutely cannot be open sourced, I think there's lots of solutions to open sourcing the rest of it, you know, just kind of like you said, abstract it, you know, cut a chunk of it out and then make that private and then, um, you know, make sure you update the history and stuff, I guess.
- Bekah:
Alright, we've got a couple of minutes left, so are there any other considerations or tips that you have for listeners who are considering open sourcing a project?
- Dan:
No, I mean, I think, you know, I keep thinking about, okay, what do I do if I'm going to be doing this? And with my context, lots of times I'm thinking about the small packages. That's the kind of thing that I have personally open sourced myself. Usually it's small utilities that help me in some way. And so the first thing I would do personally is get it up on open source and maybe not announce it, but try to use it, um, with my own stuff, you know, point to the new one or whatever it is, uh, use it as if you were a potential contributor or potential consumer of the thing. Um, and make sure you don't have the keys, you know, like lots of times you have, uh, like my GitHub key is in an SSH thing, you know, I have all this stuff set up on my computer, in terminal and things like that. And nobody else does except for me. Right. It's my personal stuff, but it needs to work if I don't have that stuff, you know. And so finding ways to log out and not use any of your current setups and start from scratch, you know, try to use it, make sure it works, and if you're publishing it somewhere, like npm, make sure that is working. And, I mean, I think that's pretty much it, you know. And then if it's the kind of project where, you know, you want to improve it, I think that's another piece, filling out some issues, uh, for people to work on. If you're like, this project could be cooler and I have some ideas, I don't have time to work on it or whatever, you know, um, I think that kind of thing can be very cool too, like announcing a new project. Hey, here's my project. Like, yes, use it. But also if you want to contribute, I got like 20 issues right here, you know what I mean? Um, and I'm looking for help, you know, that kind of thing can be cool too.
Cause especially if it's people, you know, might be excited about just kind of jumping on to help out a little bit. Um, so I think, uh, providing, providing, uh, opportunities for people to contribute as well as, you know, use the thing in whichever way can be a neat way to do it too.
- Bekah:
I think that's a great tip. And I think that's a great way to end this episode. So thanks everybody for listening to our episode on open sourcing a private project. And we'll catch you later.
- Dan:
All right. Thanks everyone. Thank you so much for listening to this episode of the Virtual Coffee Podcast. This episode was produced by Dan Ott and Bekah Hawrot Weigel. If you have questions or comments, you can hit us up on Twitter @VirtualCoffeeIO or email us at podcast@virtualcoffee.io. You can find the show notes, sign up for the newsletter, buy some VC merch, and check out all of our other resources on our website, virtualcoffee.io. If you're interested in sponsoring Virtual Coffee, you can find out more information on our website at virtualcoffee.io/sponsorship. Please subscribe to our podcast and be sure to leave us a review. Thanks for listening and we'll see you next week.
The Virtual Coffee Podcast is produced by Dan Ott and Bekah Hawrot Weigel and edited by Dan Ott.