Jupyter4NFDI Survey Summary
Jupyter4NFDI Survey
The Jupyter4NFDI Survey was designed and conducted to gauge the current state of Jupyter usage within the NFDI consortia and to gather feedback on the Jupyter4NFDI project. In total, 75 people from 53 German research institutions participated in the user survey between Nov. 28th 2024 and Jan. 6th 2025. This report provides an overview of the survey results with respect to the participants, their current usage of Jupyter, and their expectations of the Jupyter4NFDI project. The results contribute to developing the Jupyter4NFDI infrastructures in accordance with the requirements and expectations of the NFDI community. They will also enable the Jupyter4NFDI consortium to consolidate efforts and resources within the NFDI and, if applicable, with outside collaborators.
For any questions, feedback, or comments, please contact the Jupyter4NFDI consortium here
To check out the detailed results, you can just click on the Binder button below. This will open an interactive Jupyter Notebook where you can explore the data in more detail. We have provided the code for you to run the descriptive analysis but you can also explore the data yourself!
Questionnaire
Below you can find a diagram that shows the questions asked in the survey and how the survey branched based on participants' responses.
flowchart TD
    A1[Name of your organisation?] --> A2[Which NFDI consortium are you connected to?]
    A2[Which NFDI consortium are you connected to?] --> A3[Describe your role in the NFDI consortia]
    A3[Describe your role in the NFDI consortia] --> A4[Do you already know Jupyter4NFDI?]
    A4 --> A6[What do you expect from Jupyter4NFDI?]
    A6 --> A7[Where, and how you are using Jupyter?]
    A7 --> A8[Select the appropriate branch that describes your activity.]
    A8 --> B4[Other]
    B4 --> F1
    A8 --> B1[User]
    A8 --> B2[Representative/Manager]
    A8 --> B3[Resource Provider]
    B3 --> D1[Are you responsible for the operation/management of an infrastructure and/or service within the NFDI consortia?]
    D1 --> D2[Does your institute/center have its own IaaS cloud/cluster?]
    D2 -->|Yes| D3[On which basis do you run your cloud/cluster or shared resources?]
    D2 -->|No| D4[Do you already operate a JupyterHub or similar service?]
    D3 --> D4
    D4 -->|JupyterHub| D6[Can you provide a link to this service?]
    D4 -->|Similar Service| D5[Which similar service do you use instead?]
    D6 --> D7[Which spawner do you use?]
    D4 -->|No| D8[Are you willing to connect shared resources to the central JupyterHub?]
    D5 --> D8
    D7 --> D8
    D8 --> |Yes| D11[How many shared resources would you offer - CPUs, RAM, GPUs, storage?]
    D8 --> |No| G1
    D14 --> G1
    D8 --> |Only specified Users| D9[Which users would be eligible to use your resources?]
    D9 --> D10[Can you specify these user group within the NFDI?]
    D10 --> D12[Are there any specific policies attached to the usage? Please provide links, if available.]
    D12 --> D13[What benefits do you expect from connecting your resources?]
    D11 --> D12
    B2 --> E1[Do you know about Jupyter?]
    E1 --> E2[How large is the group which you represent? Please indicate the number of people.]
    E2 --> E3[What benefits do you expect from a centralised NFDI Jupyter service for the group that you are representing?]
    E3 --> E4[Do you have dedicated partners offering Jupyter services?]
    E4 -->|Yes|E5[Can you say which institution and/or person is responsible for the operation?]
    E4 -->|No|G1
    E4 -->|I don't know|G1
    B1 --> F1[For what purpose do you use Jupyter?]
    F1 --> F2[How are you currently using Jupyter services?]
    F2 -->|Yes| F3[Who's the provider of the JupyterHub?]
    F2 --> F4[What are your resource requirements?]
    F3 --> F4
    F4 --> F5[What are your environment requirements?]
    F5 --> F5b[Do you have any other resource or environment requirements?]
    F5b --> F6[Do you require access to data outside of the notebook?]
    F6 -->|Yes| F7[Do you need write access to shared data?]
    F6 -->|No| F9
    F6 -->|I don't know| F9
    F7 --> F8[Which other external data sources do you need?]
    F8 --> F9[Would you like to offer software or services through the platform?]
    F9 -->|Yes| F10[Can you briefly elaborate which software or services you would offer via the platform?]
    F9 -->|No| F11[Do you know about Binder?]
    F10 --> F11
    F11 --> F12[Do you need reproducibility from Git or data repo - binder-like functionality / FAIR digital objects]
    F12 --> F13[Do you know about JupyterLite?]
    F13 --> F14[Do you know about Google Colab?]
    F14 --> F15[What advantages would you expect from using the Jupyter4NFDI service compared to the services mentioned before - Binder, JupyterLite, and Google Colab?]
    F15 --> F16[What do you think might be missing in the Jupyter4NFDI service?]
    F16 --> F17[One can run various backends behind a JupyterHub proxy. What other services would you be interested in?]
    F17 --> G1[Can we contact you in the future if we have further questions or would like to send you more information?]
    E5 --> G1
    D13 --> G1
    G1 -->|Yes|G2[What is your first name?]
    G2 --> G3[What is your last name?]
    G3 --> G4[What is your email address?]
    G4--> G5[Would you be willing to participate in a user study?]
    G5--> G6[Would you like to provide additional relevant information to the Jupyter4NFDI service team that was not asked in the survey?]

    %% Styling Nodes with fill colors
    %% Nodes in #cce7f9: B1, B4, and F1 to F17
    style B1 fill:#cce7f9,stroke:#333,stroke-width:1px
    style B4 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F1 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F2 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F3 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F4 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F5 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F5b fill:#cce7f9,stroke:#333,stroke-width:1px
    style F6 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F7 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F8 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F9 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F10 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F11 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F12 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F13 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F14 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F15 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F16 fill:#cce7f9,stroke:#333,stroke-width:1px
    style F17 fill:#cce7f9,stroke:#333,stroke-width:1px

    %% Nodes in #ffe8c4: B2, E1, E2, E3, E4, E5
    style B2 fill:#ffe8c4,stroke:#333,stroke-width:1px
    style E1 fill:#ffe8c4,stroke:#333,stroke-width:1px
    style E2 fill:#ffe8c4,stroke:#333,stroke-width:1px
    style E3 fill:#ffe8c4,stroke:#333,stroke-width:1px
    style E4 fill:#ffe8c4,stroke:#333,stroke-width:1px
    style E5 fill:#ffe8c4,stroke:#333,stroke-width:1px

    %% Nodes in #ccecd8: B3, D1 to D13
    style B3 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D1 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D2 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D3 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D4 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D5 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D6 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D7 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D8 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D9 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D10 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D11 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D12 fill:#ccecd8,stroke:#333,stroke-width:1px
    style D13 fill:#ccecd8,stroke:#333,stroke-width:1px
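
To make the branching concrete, the following minimal Python sketch encodes only the representative/manager branch of the questionnaire as an adjacency list and walks it to list every question node that a participant in this branch could reach. The node IDs are taken from the diagram; the `EDGES` dictionary and the `reachable` helper are illustrative names chosen here and are not part of the survey tooling.

```python
from collections import deque

# Edges copied from the representative/manager branch of the diagram.
# Node IDs (B2, E1, ...) are the ones used in the flowchart source.
EDGES = {
    "B2": ["E1"],
    "E1": ["E2"],
    "E2": ["E3"],
    "E3": ["E4"],
    "E4": ["E5", "G1"],  # Yes -> E5; No / I don't know -> G1
    "E5": ["G1"],
    "G1": ["G2"],        # only followed if the participant answers Yes
    "G2": ["G3"],
    "G3": ["G4"],
    "G4": ["G5"],
    "G5": ["G6"],
}


def reachable(start, edges):
    """Return every node reachable from start, in breadth-first order."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


questions = reachable("B2", EDGES)
print(questions)
```

The same traversal could be applied to the user or resource provider branch by copying the corresponding edges from the diagram.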
Participants
In total, 75 people from 53 German research institutions participated in the user survey between Nov. 28th 2024 and Jan. 6th 2025. Participants' institutions were mostly universities and research institutions all across Germany.
In terms of NFDI consortia, survey participants came from 27 different consortia but mostly from consortia with a focus on natural sciences and computational methods.
Within their respective consortia, participants fulfill various roles, but mostly management and leadership roles, followed by IT and infrastructure development, and roles as researchers and scientists.
Interestingly, only about half of all participants already know Jupyter4NFDI, despite their backgrounds in quantitative research and computational methods and their membership in other NFDI consortia.
In terms of expectations, participants were mostly interested in easy access, a single entry point and training.
Expectation | Count |
---|---|
expect_easy_access | 53 |
expect_single_entrypoint | 40 |
expect_training | 28 |
expect_persistant_storage | 27 |
expect_easy_FAIR_objects | 26 |
expect_tech_support | 24 |
expect_X_consortium_collab | 22 |
expect_dont_know | 9 |
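
The counts above can be put in relation to the sample size. The short Python sketch below converts them into percentages, under the assumption that all 75 participants were shown this question (the question precedes the branching in the questionnaire, but some participants may have skipped it). The variable names are illustrative. Because the counts add up to more than 75, participants could evidently select several expectations, so the shares do not sum to 100%.

```python
# Counts copied from the table above.
# Assumption: the denominator is the full sample of 75 participants.
N = 75
expectations = {
    "expect_easy_access": 53,
    "expect_single_entrypoint": 40,
    "expect_training": 28,
    "expect_persistant_storage": 27,
    "expect_easy_FAIR_objects": 26,
    "expect_tech_support": 24,
    "expect_X_consortium_collab": 22,
    "expect_dont_know": 9,
}

# Share of all participants who selected each expectation, in percent.
shares = {name: round(100 * count / N, 1) for name, count in expectations.items()}

for name, pct in shares.items():
    print(f"{name:<28} {pct:5.1f}%")
```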
With respect to Jupyter usage, most participants classified themselves as users of Jupyter, followed by the role of resource provider and representative/manager.
Branch | Frequency |
---|---|
Other | 3 |
Representative/Manager | 10 |
Resource Provider | 17 |
User | 38 |
NA | 7 |
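
Since the branch sizes serve as the base for the branch results, it can help to see them as shares. The following Python sketch, with illustrative variable names, checks that the frequencies add up to the 75 participants and computes each branch's share among the participants who selected a branch, leaving out the 7 missing (NA) responses. Treating NA as "no branch selected" is an assumption.

```python
# Frequencies copied from the table above.
branches = {
    "Other": 3,
    "Representative/Manager": 10,
    "Resource Provider": 17,
    "User": 38,
}
missing = 7  # NA row; assumed to mean that no branch was selected

answered = sum(branches.values())
total = answered + missing

# Share of each branch among participants who selected a branch, in percent.
valid_share = {name: round(100 * n / answered, 1) for name, n in branches.items()}

print(f"total = {total}, answered = {answered}")
for name, pct in valid_share.items():
    print(f"{name:<24} {pct:5.1f}%")
```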
Branch Results
Depending on participants' role in using Jupyter, the survey was branched, with different participants answering different questions tailored to their roles. The following tabsets thus describe only the respective subsets of participants who were shown the questions corresponding to their role.
Survey participants in the user role were predominantly using Jupyter whenever possible or specifically for workshops and training.
Most of the participants in the user role were using Jupyter through JupyterLite or through Google Colab.
Participants in the user role who used Jupyter through external providers relied on a variety of different providers.
For most participants in the user role, all of the resource requirements asked about were important:
- Concurrent Session Average
- Concurrent Session Peak
- User Number
- GPUs per Session
- CPUs per Session
- RAM per Session
- Persistent Storage per Session
Similarly, all of the environment requirements asked about were important for most users:
- Lab extensions
- Software
- Licenses
- Custom Images
Most participants in the user role indicated that they need read and write access to external data sources in their Jupyter notebooks.
Among the users, the majority are not yet sure whether they want to offer software and services through Jupyter in the future. Among those who do know, about half plan to do so and half do not.
Among Jupyter users in our sample, most have never heard of Binder.
The majority of the sampled users indicated that they need reproducibility from Git or similar systems in their notebooks. Some users are still unsure, and only very few say outright that they do not need these features.
The majority of users in the sample indicated that they have never heard of JupyterLite but already use Google Colab.
With respect to other backends running behind a JupyterHub proxy, most surveyed users are interested in neither RStudio Server nor VS Code.
Participants in the resource provider role were mostly not personally responsible for infrastructures and services.
About 50% of participants in the resource provider role had their own IaaS cluster at their institutions.
Of those participants who do have their own IaaS cloud, most run it in a bare-metal configuration or on Kubernetes.
Most participants in the resource provider role already operate a JupyterHub, followed by those who operate neither a JupyterHub nor a similar service. Only two participants indicated that they operate a similar service that is not JupyterHub.
Among those participants who do operate a JupyterHub, only a few services are publicly available; most are in development or for internal use only.
Among those participants who do operate a JupyterHub, most use KubeSpawner.
Among participants in the resource provider role, none are willing to connect shared resources to the central JupyterHub for an unspecified group of users, and only a few are willing to share resources with a specific group of users. Most are not willing to connect shared resources to the central JupyterHub at all.
Of those participants who are willing to connect shared resources to the central JupyterHub for a specific group of users, all are willing to do so for local users, but only half are willing to do so for NFDI users or for external users, respectively.
For those participants willing to connect shared resources to the central JupyterHub for a specific group of users, only 50% at most would expect any benefits from doing so.
Of the participants in the role of representatives or managers, all already know Jupyter.
The groups represented by participants in the role of representative or manager are relatively large (Mean = 44.71, Median = 30).
Of the participants in the role of representative or manager, half already have a dedicated partner offering Jupyter Services.
Further Contact
Most survey participants were interested in being contacted for further surveys and in participating in the user study.
Summary
Participants
In total, 75 people from 53 German research institutions participated in the user survey.
Survey participants came from 23 different consortia but mostly from consortia with a focus on natural sciences and computational methods.
Participants' institutions were mostly universities and research institutions all across Germany.
Within their respective consortia, participants fulfill various roles, but mostly management and leadership roles, followed by IT and infrastructure development, and roles as researchers and scientists.
Interestingly, only about half of all participants already know Jupyter4NFDI, despite their backgrounds in quantitative research and computational methods and their membership in other NFDI consortia.
In terms of expectations, participants were mostly interested in easy access, a single entry point and training.
With respect to Jupyter usage, most participants classified themselves as users of Jupyter, followed by the role of resource provider and representative/manager.
Branch Results
Survey participants in the user role were predominantly using Jupyter whenever possible or specifically for workshops and training.
Most of the participants in the user role were using Jupyter through JupyterLite or through Google Colab.
Participants in the user role who used Jupyter through external providers relied on a variety of different providers.
For most participants in the user role, all of the resource requirements asked about were important: Concurrent Session Average, Concurrent Session Peak, User Number, GPUs per Session, CPUs per Session, RAM per Session, Persistent Storage per Session.
Similarly, all of the environment requirements asked about were important for most users: Lab extensions, Software, Licenses, Custom Images.
Most participants in the user role indicated that they need read and write access to external data sources in their Jupyter notebooks.
Among Jupyter users in our sample, most have never heard of Binder.
The majority of the sampled users indicated that they need reproducibility from Git or similar systems in their notebooks. Some users are still unsure, and only very few say outright that they do not need these features.
The majority of users in the sample indicated that they have never heard of JupyterLite but already use Google Colab.
With respect to other backends running behind a JupyterHub proxy, most surveyed users are interested in neither RStudio Server nor VS Code.
Participants in the resource provider role were mostly not personally responsible for infrastructures and services.
About 50% of participants in the resource provider role had their own IaaS cluster at their institutions.
Of those participants who do have their own IaaS cloud, most run it in a bare-metal configuration or on Kubernetes.
Most participants in the resource provider role already operate a JupyterHub, followed by those who operate neither a JupyterHub nor a similar service. Only two participants indicated that they operate a similar service that is not JupyterHub.
Among those participants who do operate a JupyterHub, only a few services are publicly available; most are in development or for internal use only.
Among those participants who do operate a JupyterHub, most use KubeSpawner.
Among participants in the resource provider role, none are willing to connect shared resources to the central JupyterHub for an unspecified group of users, and only a few are willing to share resources with a specific group of users. Most are not willing to connect shared resources to the central JupyterHub at all.
Of those participants who are willing to connect shared resources to the central JupyterHub for a specific group of users, all are willing to do so for local users, but only half are willing to do so for NFDI users or for external users, respectively.
For those participants willing to connect shared resources to the central JupyterHub for a specific group of users, only 50% at most would expect any benefits from doing so.
Of the participants in the role of representatives or managers, all already know Jupyter.
The groups represented by participants in the role of representative or manager are relatively large (Mean = 44.71, Median = 30).
Of the participants in the role of representative or manager, half already have a dedicated partner offering Jupyter Services.