1 Introduction

Children are frequent internet users (e.g., Danovitch, 2019; Šmahel et al., 2020) and schools are widely adopting internet-enabled educational technologies (e.g., Haßler et al., 2016; Mulet et al., 2019). However, the question remains whether children understand how the internet works. The internet is a complex amalgam of socio-technological artifacts, and as such, it can be understood from social (user) and technical perspectives. The social perspective refers to people’s behavior online, their user competencies, information literacy, technology-enhanced education and the like (e.g., see Chaudron et al., 2015; Finkelhor et al., 2021; Fraillon et al., 2020; Rozgonjuk et al., 2021; Šmahel et al., 2020). The technical perspective refers to understanding the causal fabric of the internet: its infrastructure, processes, and how they are intertwined.

The internet’s underlying fabric is mostly opaque: invisible to everyday users. Yet understanding it can help children benefit from internet-enabled devices in technology-enhanced classes. Moreover, it can aid in avoiding online risks, especially because internet-related systems evolve dynamically: from personalized web services to the smart networks increasingly controlling our households. Knowing how these systems work empowers a person to see “what will happen when” in relation to internet-related data and processes, such as how online information can and cannot be misused or how intruders can intercept communications. Can, for instance, smart cars or cloud-connected solar power plants be hacked, and what consequences would that have? In short, technical understanding can help to answer socially relevant questions as well as to increase internet user competencies. As Yan (2006, 2009) repeatedly demonstrated, social understanding of the internet increases with technical understanding, but not vice versa. It is the technical understanding that can help children make informed decisions concerning internet-related systems.

Research on children in the context of the internet tends to focus on social rather than technical perspectives (cf., e.g., Fraillon et al., 2020; Livingstone et al., 2019; Šmahel et al., 2020). Concerning the latter, Yan’s series of seminal studies (2005, 2006, 2009) examined the overall level of internet social and technical understanding among multiple age cohorts, pointing at surprising knowledge gaps even among adolescents. Understanding appears to increase with age, but remains mostly perception-bound: the studies suggest that understanding is based on personal experience with the digital world rather than on educational scaffolding (see also Bordoff & Yan, 2017 and Danovitch, 2019). Yan’s studies, though, did not report specific ideas children have about the internet, and only a few other studies did so. For example, some children around ages 7–9 appear to think that the internet resides inside their phone (e.g., Eskelä-Haapanen & Kiili, 2019; Mertala, 2019) and younger children are barely aware of what ‘online’ means (Chaudron et al., 2015). Some 13–14-year-olds believe that the internet’s content is stored on a central computer (e.g., Diethelm et al., 2012a). These studies map child knowledge in detail; however, they tend to focus on the internet from a specific angle while using a narrowly defined age group and often a small sample. Moreover, several studies were conducted in now-dated technological contexts, such as before the wide availability of smartphones (e.g., Mumtaz, 2002; Papastergiou, 2005). Altogether, existing studies suggest that children’s knowledge about the underlying fabric of the internet is limited and patchy at any age. However, the literature on this topic is not well organized and is itself limited. Therefore, a systematic synthesis of previous findings can bring us closer to an overall picture of children’s knowledge of the internet’s technical side.

The present study fills this gap by presenting a systematic review that addresses the following research questions:

  • RQ1: What are the general conceptions about the internet that children 3–15 years of age have?

  • RQ2: What conceptions do children have about the internet infrastructure?

  • RQ3: What conceptions do children have about dataflow on the internet?

This review maps how child knowledge evolves from kindergarten (ISCED-02; age from ~ 3) to the end of lower secondary education (ISCED-2; age up to ~ 15). The lower end of the interval was chosen for the following reasons: a) children use digital technologies from a preschool age (e.g., Danovitch, 2019; Edwards et al., 2018); b) it has been argued that there is a need to start with cyber-safety education at a very young age, although it is not yet clear at what age to start building a technical understanding (cf., e.g., Edwards et al., 2018; Sprung et al., 2020); and c) studies that map internet-related understanding, such as understanding of internet-enabled toys, have recruited children as young as 3 years of age (Mertala, 2020). The upper end was chosen because this is often the end of compulsory schooling.

The review is guided by the following question: at what age do specific conceptions start to appear at the earliest? Bear in mind though that the absence of evidence is not evidence of absence: some conceptions may appear earlier than reported by the studies.

For terminological clarity, we use the term conceptions for units of understanding (what is in students’ minds), whereas concepts refer to the real-world constituents (what is real). Conceptions may be acquired from various sources outside schools (these would typically be called pre-conceptions) or in schools (mature conceptions); they can be correct, partially correct or incorrect (misconceptions). Previous studies did not always examine the origin of children’s knowledge; therefore, this is not the main focus of the current review, although we comment on it whenever possible.

This synthesis will prove useful for educational designers and practitioners at the K-8 level. First, the use of the internet and internet-enabled technologies in the context of child education has increased dramatically (e.g., Mulet et al., 2019). Second, information technology curricula have been changing in many countries (e.g., Gal-Ezer & Stephenson, 2014; Hubwieser et al., 2015), and they should now address internet literacy too (e.g., CSTA, 2017). However, this endeavor is also nascent. When designing new learning materials and methods, knowing how children think about the respective educational target is a key initial step (i.e., it is crucial to activate prior knowledge to allow children to update their beliefs rather than creating a separate, and possibly contradictory, understanding of the concept; e.g., Diethelm et al., 2012b; Duit et al., 2012). Yet the recently emerged educational materials and methods for teaching children about how the internet works (e.g., Czech TV, 2020; Internet-ABC, 2022; Liukas, 2018; Page et al., 2020) do not appear to capitalize on children’s prior ideas. Most related research on K-8 education focuses on teaching programming, while internet-related projects target higher education learners (see Footnote 1). Therefore, this systematic review has value especially for educational stakeholders seeking to teach children a technical understanding of the internet in the most efficient manner or to employ internet-based technologies in non-ICT subjects.

2 Theoretical framework

A theoretical framework allows one to understand and explain research findings. In this review, we use an explanatory framework from Edwards et al. (2018) that draws on Vygotsky’s ideas (1987). This is a constructivist framework of knowledge acquisition. Constructivist models are widely used in science education (e.g., diSessa, 2014; Mayer, 2021; Özdemir & Clark, 2007), including the literature on internet-related conceptions (e.g., Yan, 2009) and other computing education topics (e.g., Diethelm et al., 2012b; Lister, 2016; Sorva, 2012). Within these models, acquisition of normatively correct knowledge is an effort-intensive process that requires learners to construct new mental entities within their working memory and integrate them with more stable knowledge representations within the long-term memory. The integration process involves building new knowledge entities from, and on top of, prior knowledge. This process does not include replacement of existing knowledge entities with new ones: prior knowledge cannot be deleted, it remains in the long-term memory. However, the integration process can augment or alter it.

Within the Vygotskian constructivist framework, children start to develop conceptual understanding by building ‘everyday’ conceptions stemming from their daily practices and experiences (e.g., “Sometimes the videos on YouTube do not play.”). Later, they gradually enrich their ‘everyday’ knowledge base through ‘scientific’ explanations. This typically happens in schools. These ‘scientific’ explanations relate to how and why things work (e.g., “We need an internet connection in order to watch YouTube videos.”). During acquisition of ‘scientific’ conceptions, both ‘scientific’ and ‘everyday’ ideas are integrated in the long-term memory, forming ‘mature’ conceptions that have an explanatory power (e.g., “When YouTube isn’t working, I should try reconnecting my device to the internet.”). However, integration of ‘scientific’ conceptions can be challenging, especially when the new conceptions relate to things invisible to regular users (such as a server or network router). Thus, it can happen that some ‘scientific’ conceptions are not properly connected with previous knowledge. Children may still recall such unconnected ‘scientific’ ideas but only when appropriately cued and only as “empty facts” unrelated to their prior experiences. They may also forget them entirely. Consequently, especially among younger children, knowledge based on personal experiences can be more common than ‘scientific’ conceptions referring to invisible concepts.

3 Method

The research was carried out in the following phases: a) searching for the literature, b) coding of the included studies, and c) interpretation of the coding process results.

3.1 Search process, inclusion/exclusion criteria

As the initial step, articles were searched using various combinations of keywords such as ‘children’, ‘[pre|mis]conceptions’, ‘internet’, ‘understanding’ or ‘computing [education]’ in Google Scholar and the ACM Digital Library. However, this process yielded only a few canonical studies and many unrelated ones; hence this approach was deemed insufficient. Therefore, we resorted to additional snowball sampling (Wohlin, 2014). Snowball sampling is carried out in rounds. In each round, we follow both the references within the studies found in the previous round and the later works citing those studies. The process continues until no new work is discovered. The following studies were included in the initial set: Diethelm et al. (2012a), Mertala (2019), Mumtaz (2002), and Yan (2005, 2006). They were identified by a keyword search of ‘children conception[s] internet understanding’. Subsequent citations were searched in the same databases (Google Scholar, ACM Digital Library; see Footnote 2).
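The round-based procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the tooling used in this review: the function name `snowball`, the paper identifiers, and the toy `references` and `cited_by` mappings are all hypothetical, and the `is_relevant` callback stands in for the manual screening of titles and abstracts.

```python
def snowball(seeds, references, cited_by, is_relevant):
    """Follow references and citations round by round until nothing new appears."""
    included = set(seeds)
    screened = set(seeds)   # every paper inspected so far, relevant or not
    frontier = set(seeds)   # papers added in the previous round
    rounds = 0
    while frontier:
        rounds += 1
        candidates = set()
        for paper in frontier:
            candidates.update(references.get(paper, ()))  # works the paper cites
            candidates.update(cited_by.get(paper, ()))    # works citing the paper
        candidates -= screened      # never screen the same paper twice
        screened |= candidates
        frontier = {p for p in candidates if is_relevant(p)}
        included |= frontier
    return included, rounds


# Hypothetical toy citation data; "x" is an unrelated paper that is screened out.
references = {"seed": ["a", "x"], "a": ["b"]}
cited_by = {"seed": ["c"], "c": ["d"]}
relevant = {"a", "b", "c", "d"}

found, rounds = snowball({"seed"}, references, cited_by, relevant.__contains__)
```

In this toy run the last round discovers nothing new, which is the stopping condition described in the text.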

Altogether, more than 2,000 references were inspected (the exact count is difficult to report due to nondeterminism of Google Scholar) by reading titles and, if appropriate, the abstracts. Sixty-eight studies were read in full. Of these, 27 satisfied the following inclusion criteria and were included in the review. The search process according to the PRISMA guidelines (Page et al., 2021) is depicted in Fig. 1.

Fig. 1 Search process

We included studies that

  1. primarily examined participants from pre-primary education (ISCED-02; age from ~ 3) to the end of lower secondary schooling (ISCED-2; age up to ~ 15 or Grade 8/9); some works examined border cases (e.g., Brinda et al., 2018: Grade 9–11) and these were included;

  2. reported at least one research finding that can be defined as a participant conception about technical understanding of the internet;

  3. were released by 30 June 2022.

No restrictions on research method were imposed: interview, focus group, drawings as well as survey studies were included. Likewise, all available types of work were included (i.e., conference and journal papers, student theses).

We excluded studies that

  1. were not available (e.g., Luckin et al., 2001);

  2. examined adults or higher education learners only (e.g., Dechand et al., 2019);

  3. examined only internet user aspects, including internet usage/behavior studies (e.g., Chaudron et al., 2015; Johnson, 2010) and large-scale internet use surveys (e.g., Šmahel et al., 2020);

  4. examined threat models (e.g., Zhang-Kennedy et al., 2016), data literacy and privacy risks studies (e.g., Zhao et al., 2019; Bowler et al., 2017; see Livingstone et al., 2019);

  5. examined information behavior, including mental models of search engines or processes (e.g., Holman, 2011);

  6. examined conceptions about individual computers or embedded devices only (e.g., Rücker & Pinkwart, 2018).

3.2 Coding of study variables

The following variables were coded for each study: the study’s country, participants’ ages (or grades: based on information available in the report), sample size, research method, and duration of the session.

3.3 Analysis and coding of conceptions

An inductive thematic analysis (Braun & Clarke, 2006) was used for coding conceptions in the studies. Thematic analysis is a qualitative process usually employed for coding interview transcripts or video recordings. It is an inductive process based on inferring the coding scheme in a bottom-up fashion by iterative reading of the data set. New codes are added until saturation; already existing codes can be renamed, merged, split and/or linked (i.e., eventually organized into a form of mind maps). The analysis was conducted in Atlas.ti 22 software consensually by the first three authors. The fourth author acted as an auditor (Hill, 2012; Hill et al., 1997). The coding procedure continued until a consensus among the first three authors was reached. The codes were subsequently reviewed by the auditor.

Because original data from the studies was not available, the data set included the text of studies where the following passages were coded:

  1. Individual participant citations or drawings, when available (e.g., “[the internet is] a device operated by man”; Papastergiou, 2005, p. 348);

  2. Summaries of findings as written by the authors (e.g., “Only 40% of our students talked about cable or wireless transmission. No student mentioned using a combination of these two.”; Diethelm et al., 2012a, p. 71).

When possible, we examined whether the coded passage referred to one, a few, or a larger number of participants. In this way, it was possible to approximate whether or not the conception in question was rare in the respective study. This was crucial because qualitative studies rarely report exact frequencies, so a full-fledged frequency analysis cannot be conducted.

Our analysis was particularly interested in the content mentioned in curricular documents (e.g., CSTA, 2017; MEYS, 2021), including

  • general perception of the internet: how the internet is generally perceived;

  • internet infrastructure: knowledge of routers/switches and servers, communication media (Wi-Fi vs. cable vs. cellular data), internet architecture;

  • data transfer: awareness of how data travels across the internet and where it is stored; awareness of packets, data persistency, clouds, data encryption; knowledge of uploading, downloading, streaming; addressing.

4 Results

4.1 Studies

Twenty-seven studies were identified, dating from 2002 to 2022, with a total sample of 2,214 participants aged 3–17 years. We primarily focused on ages 3–15, but some participants in four of the studies were older than 15 years of age (Brinda & Braun, 2017: Grades 9 and 11; Pancratz & Diethelm, 2020: ages 10–18; Papastergiou, 2005: ages 12–16; Yan, 2009: ages 9–17). Studies were grouped into the following age categories: 3–9 (k = 9); 5–12 (k = 2); 8–12 (k = 4); 10–16 (k = 7); 12–17 (k = 5) (see Table S1 for all the studies).

4.2 Coding scheme

The coding process resulted in 60 themes (55 conceptions; 5 other ideas) organized in a three-level hierarchical code system (see Table S2 for all codes, and the number of studies supporting the respective codes and quotations). The top level included four groups (Fig. 2):

  1) General perceptions of the internet. This group includes 22 specific themes about the entire internet and thus relates to RQ1 and RQ2. These themes can be categorized into three subgroups relating to perceptions of the internet: a) as a technical artifact, b) in terms of internet-related activities and practices, and c) as other ideas.

  2) Infrastructure. This group concerns elements of the internet’s static infrastructure, thus being most related to RQ2. It includes 16 elements organized into four subgroups concerning a) accessing the internet – generally, b) Wi-Fi-related conceptions – specifically, c) devices helping to resend data over the network, and d) transmission medium – generally.

  3) Dataflow. This group concerns dynamic aspects of the internet: data movement and storage. Thus, it is most related to RQ3. It includes 21 themes organized into four subgroups concerning conceptions related to a) storage of internet data, b) sending and downloading data, c) communication speed and quality, and d) addressing.

  4) Contradictions. This is a specific code referring to contradictions in understanding for a particular child. We created it in order to estimate the prevalence of contradictions.
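As a quick consistency check on the counts reported above, the group sizes can be tallied as follows. This is an illustrative sketch only: the dictionary name is hypothetical, and counting the single Contradictions code as one theme is our assumption, made because the three other groups sum to 59 of the 60 reported themes.

```python
# Number of themes per top-level group, as reported in the text.
themes_per_group = {
    "General perceptions of the internet": 22,
    "Infrastructure": 16,
    "Dataflow": 21,
    "Contradictions": 1,  # assumption: the single code counts as one theme
}

total_themes = sum(themes_per_group.values())

# The text also splits the themes into conceptions and other ideas.
conceptions, other_ideas = 55, 5

print(total_themes, conceptions + other_ideas)
```

Under this assumption, both ways of counting agree on 60 themes.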

Fig. 2 Top-level categories and subcategories of the coding system (n: number of third-level elements). Supporting literature in this and the following figures is organized along five age categories: 3–9y, 5–12y, 8–12y, 10–16y, 12–17y. Each row (see ‘Contradictions’) represents one age category. An em dash indicates that literature is missing for a specific age category. A question mark (?) denotes borderline evidence

4.3 General perceptions of the internet

  A. The internet as technical artifact

Yan’s seminal work defined four increasingly complex conceptions of general internet structure (2005). Here, five conceptions were identified with 7 associated themes (Fig. 3):

  1) The internet is difficult to separate from the digital device. We found robust evidence that the majority of the youngest children with online experience conflate, in one way or another, the internet with the device used to access it. This evidence comes from 6/9 papers concerning the 3–9-year-olds (e.g., Edwards et al., 2018; Mertala, 2019). This conception is less frequent among older groups, but it does not disappear.

    Aside from a general difficulty in viewing the internet as an independent entity, some children have a specific conception that The internet is something inside the device. This is, again, less frequent among older groups. For example, Yaghobová (2021) found it among 8/28 fifth-graders but only 1/28 ninth-graders. However, some statements from the primary literature (e.g., “the internet is a thing on my phone”, Yaghobová, 2021, p. 59, Grade 5) can be interpreted in two ways: the entire internet is inside the device or only part of it is.

    In one study with first-graders (Brante & Walldén, 2021), we found evidence for conceptions of the internet as a Feature of apps that enable internet access and as Something that arrives to the phone; although the evidence for the latter is ambiguous.

  2. 2)

    The internet is beyond the device. Evidence was found in 5/9 of the youngest cohort papers that some 5–7-year-olds already understand that it is necessary to connect to “something” outside the device to get the internet-related functionality, yet exhibit no understanding of the internet as a networked structure. This conception was not dominant among these children but also not rare (e.g., Mertala, 2019; Oliemat et al., 2018). It may be contingent on personal experience with manual (i.e., non-automatic) connecting to the internet or the connection being broken (e.g., Mertala, 2019). The conception was frequently noted also among older cohorts, but some of these participants could have more complex knowledge, which was not probed enough in the studies.

    An associated theme is Struggling with recognizing of online vs. offline behavior; encountered in all age cohorts (5 studies in total) but most frequently in the youngest ones. This idea is different than conflating the device and the internet: one can know that the internet is an outside-device entity, yet struggle with categorizing some activities as online vs. offline. For example, Kumar et al. (2017) data suggested that even some parents correctly perceived web browsing as an online activity, but did not viewing videos through streaming services. Little awareness of what ‘online’ means was also highlighted in a large-scale explorative study about 0–8-year-olds’ experiences with digital technologies (Chaudron et al., 2015).

    The conception that The internet is an application (typically a search app) was registered in four studies in diverse age groups (e.g., Brante & Walldén, 2021; Yaghobová, 2021), but not often (e.g., Papastergiou, 2005, reported frequency 9%). Note that an ‘app’ could refer to the client or server part: we can rarely determine this due to participants’ overall limited understanding.

    Finally, one study with 10–12-year-olds (Murray & Buchanan, 2018) and one with 12–16-year-olds (Papastergiou, 2005) identified conceptions that The internet is a website/web and The internet is a device. The device was probably a ‘large’ device on the network other than that of the user, but the reports are unclear. The website/web conception could be more frequent compared to the device one (cf. Papastergiou, 2005), but evidence is limited.

  3. 3)

    Studies reported that many children above 9–10 years of age view the internet as more than just an entity external to the device or technology for connecting two or a few devices. Two more complex dominant conceptions emerged from the dataset. The simpler one is a Proto-network, an idea that the internet is a wide entity interconnecting many devices: either through a Central point (e.g., Papastergiou, 2005) or without any notion of internal structure (e.g., Yaghobová, 2021; Yan, 2005).

  4. 4)

    The more complex conception is a Large network with a vaguely specified structure: the internet is a worldwide network with complex internal architecture that is only partly or vaguely specified. Literature suggests repeatedly that over half of 10–15-year-olds has either Proto-network or this conception. However, because studies sometimes blur these conceptions’ borders, it is difficult to report frequencies separately for each. Generally, the more complex conception appears to be more prevalent among older participants, but a large number of 14–15-year-olds still possess only the simpler one (e.g., Brodsky et al., 2021; Yaghobová, 2021; Yan, 2005, 2009). Children younger than 9 years of age only rarely have these complex conceptions (e.g., Eskelä-Haapanen & Kiili, 2019).

  5. 5)

    The internet is a network of networks with correctly specified servers and inter-network connections. A correct notion of the internet’s architecture was reported only rarely and only among children older than about 10–11 years of age (Papastergiou, 2005; Yaghobová, 2021; Yan, 2006; see also Yan, 2005, 2009). Moreover, some children with ‘correct understanding’ of the internet (according to these studies) appeared to reach only the previous level.

Fig. 3 Main themes related to perception of the internet as a technical artifact

  B. Internet-related activities and practices

The review found robust evidence (Fig. 4) that the internet is often viewed, especially among children aged 3–12, not as a technical artifact, but as

  a) Internet-related activities (such as playing games or shopping; e.g., Eskelä-Haapanen & Kiili, 2019; Murray & Buchanan, 2018),

  b) A place for information or just for finding some “stuff” such as videos (e.g., Murray & Buchanan, 2018), or

  c) A communication medium (e.g., Brodsky et al., 2021).

Fig. 4 Themes related to perception of the internet (B) in terms of activities and (C) other ideas

Sometimes, there is an overlap between these three conceptions. For instance, viewing videos can be understood both as (a) and (b). The first two conceptions (a, b) were noted in almost all studies with children aged 3–12 (Fig. 4). The last one (c) was also relatively common, but it was noted less frequently among 3–7-year-olds, possibly because the youngest children communicate over the internet, or see someone doing so, less often than they watch videos or play games (cf. Chaudron et al., 2015). These conceptions were encountered less often among children over 12 years of age, but this could be because the respective studies probed them less often.

In the dataset, some children aged below 6–7 years were Familiar with internet-related activities but did not know the word (e.g., Mai et al., 2022; Mertala, 2019) or Knew the word ‘internet’ but could not associate it with internet-related activities (Mertala, 2019).

Children frequently had multiple conceptions at the same time (e.g., Botturi, 2021; Papastergiou, 2005), including activity-based and structurally-based ones. For instance, they could simultaneously view the internet as a Place for information plus a Large network with a vaguely specified structure; e.g.: “A 12-year-old believed that ‘the internet is several sources connected to each other with information. … There is probably billions of web pages and computers.’ He depictured the internet as a complex system with multiple servers.” (Yan, 2005, p. 391).

  C. Other ideas

We found five additional, diverse conceptions in the studies (Fig. 4). A relatively prevalent one was the view of the internet as a Non-real/fictitious place, noted almost exclusively among 8–12-year-olds. This place could be a city where buildings are individual apps, a factory, just “some place”, or a mythical environment (Botturi, 2021; Murray & Buchanan, 2018; Yaghobová, 2021). This conception may be especially likely to be invoked when children are asked to draw the internet (Botturi, 2021), and it may sometimes be acquired when watching movies (Yaghobová, 2021) such as Wreck-It Ralph.

A perception of the internet as a global entity, The internet is everywhere, appeared in the dataset from around age 8 (Eskelä-Haapanen & Kiili, 2019; Mai et al., 2022) and was also registered repeatedly in older groups. Note that this idea does not exclude the possibility of a contradictory awareness of places without internet access.

Studies also reported that the internet can be viewed as an Anthropomorphic entity (e.g., brain; Murray & Buchanan, 2018) or something Related to electricity (e.g., Eskelä-Haapanen & Kiili, 2019). Some children thought that Multiple internets exist (e.g., Yaghobová, 2021). These conceptions were rarely encountered, but it is unclear whether they are truly rare or if studies just did not probe them. For instance, older children would most likely agree that the internet is related to electricity, if asked.

4.4 Internet infrastructure

  A. General access

Evidence was found that some young children know that the internet can be accessed by Cables, mobile-created Hotspots (6y: Mertala, 2019) or Cellular data (Grade 2/3: Eskelä-Haapanen & Kiili, 2019) (Fig. 5). This knowledge is uncommon in this age group as it is, again, probably contingent on personal experience (cf. access via phone line reported in a now-dated work: 10y, Mumtaz, 2002). Older children are somewhat more knowledgeable (e.g., Diethelm et al., 2012a; Yaghobová, 2021), but the frequency is difficult to estimate from the reports.

Fig. 5 Key themes related to (A) general and (B) Wi-Fi access

Familiarity with Wi-Fi access was so prevalent that a separate code group was created.

  B. Wi-Fi access

Some 5-year-olds have a vague idea that The internet is related to Wi-Fi in terms of access (Mertala, 2019) (Fig. 5). However, based on available data, it is not possible to report prevalence. Among older children, this knowledge is not rare, but it is also not universal. For instance, Yaghobová (2021) reported that 17/28 fifth-graders and 23/28 ninth-graders viewed Wi-Fi as one means for connecting to the internet.

Unsurprisingly, a recurring theme is the ‘Wi-Fi box’ (i.e., a Wi-Fi router: often directly visible at home). Even ~ 7-year-olds (Mai et al., 2022; Mertala, 2019; unknown frequency) exhibited awareness of it.

From the studies, it is seldom clear how complex children’s ideas about Wi-Fi are and how these ideas change as they grow older. However, children clearly exhibit confusion in all age categories. For example, Yaghobová (2021) reported that 11/28 fifth-graders, but only 4/28 ninth-graders, viewed Wi-Fi to be the ‘Wi-Fi box’; whereas, 7/28 fifth-graders and 14/28 ninth-graders viewed it as a wireless signal (a more correct idea). Some children (Grade 5: 7/28; Grade 9: 9/28) had Problems with distinguishing the internet from Wi-Fi. Only 3/56 knew that devices can interconnect via Wi-Fi networks without being connected to the internet.
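To make the counts above easier to compare, they can be converted to percentages. The sketch below uses only the numbers quoted from Yaghobová (2021) in the preceding paragraph; the variable and function names are illustrative choices, and the rounding to one decimal place is ours.

```python
def percent(count, total):
    """Share of participants holding a conception, in percent (one decimal)."""
    return round(100 * count / total, 1)


# (count, group size) per grade, as quoted from Yaghobová (2021) above.
wifi_conceptions = {
    "Wi-Fi is the 'Wi-Fi box'": {"Grade 5": (11, 28), "Grade 9": (4, 28)},
    "Wi-Fi is a wireless signal": {"Grade 5": (7, 28), "Grade 9": (14, 28)},
    "Problems distinguishing the internet from Wi-Fi": {"Grade 5": (7, 28), "Grade 9": (9, 28)},
}

for conception, by_grade in wifi_conceptions.items():
    shares = {grade: percent(*pair) for grade, pair in by_grade.items()}
    print(conception, shares)

# Both grades pooled: devices can interconnect via Wi-Fi without internet access.
print("Wi-Fi without internet:", percent(3, 56))
```

With such small groups, each participant shifts a percentage by several points, so these values are best read as rough orientation.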

  C. Resending / routing

The simplest conception of how data travels over the internet (data route) is Direct transmission – user end-devices somehow communicate directly with each other. This conception was noted in four studies: all with children aged above Grade 5 (e.g., Diethelm et al., 2012a; Yaghobová, 2021). It is not dominant but also not rare.

However, the internet is a modular network with many networking devices other than home Wi-Fi routers that function as data crossroads, resending data to a server or a target user device. These networking devices include routers, switches, or base transceiver stations (BTS) in the case of cellular networks. Lay people rarely come into contact with them directly. Tellingly, a correct idea of these ‘internet crossroads’ is almost absent in the data set. Instead, three relatively common ‘single-hop’ conceptions were noted repeatedly: data travels towards a Satellite, Central point/computer or Tower, which sends it to the target device (Fig. 6). Satellites could refer to dishes or space satellites. Sometimes children meant these devices to be used not only for resending but also for storing data. In some works, these conceptions occurred in up to 40% of cases (one central computer: Diethelm et al., 2012a; satellites: Yaghobová, 2021). However, children sometimes referred not to just one central device but to several, which brought them a bit closer to the correct view.

Fig. 6 Common conceptions concerning data sending: (a) direct transmission, (b) ‘single-hop’ through satellite, (c) ‘single-hop’ through a broadcasting tower. (d) Actual ‘multi-hop’ concept through routers/switches, cables and a server

This correct view is represented by a ‘multi-hop’ idea: data travelling across multiple ‘crossroads’. For some children (scattered across five works), it was registered as a vague notion without them having a detailed understanding; therefore, this is called the Proto-routers conception (e.g., Diethelm et al., 2012a; Lindmeier & Mühling, 2020). A truly Correct notion of routers/switches was rare (e.g., Yaghobová, 2021) and it was related to the structural conception The internet is a network of networks with correctly specified servers and inter-network connections, as described above.
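The difference between the direct, ‘single-hop’ and ‘multi-hop’ views can be illustrated with a toy model that counts the intermediate devices on a route. This is a deliberately simplified sketch: the three small networks and all device names are hypothetical, and the breadth-first search only finds a shortest chain of links in the toy graph; it is not meant to reproduce how real routers select paths.

```python
from collections import deque


def shortest_path(links, source, target):
    """Breadth-first search for a shortest chain of devices from source to target."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:      # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in links.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def intermediaries(links, source, target):
    """Number of devices between the two endpoints of the route."""
    return len(shortest_path(links, source, target)) - 2


# Hypothetical toy networks for three of the views discussed above.
direct = {"phone": ["friend_phone"], "friend_phone": ["phone"]}
single_hop = {
    "phone": ["satellite"],
    "satellite": ["phone", "friend_phone"],
    "friend_phone": ["satellite"],
}
multi_hop = {
    "phone": ["home_router"],
    "home_router": ["phone", "isp_router"],
    "isp_router": ["home_router", "backbone_router"],
    "backbone_router": ["isp_router", "server"],
    "server": ["backbone_router"],
}

print(intermediaries(direct, "phone", "friend_phone"))      # 0
print(intermediaries(single_hop, "phone", "friend_phone"))  # 1
print(intermediaries(multi_hop, "phone", "server"))         # 3
```

In the toy ‘multi-hop’ network, the data passes through three ‘crossroads’ before reaching the server, whereas the ‘single-hop’ view places exactly one device between the endpoints.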

IV. Transmission medium

How do children think data is transmitted after it leaves the home Wi-Fi router (Fig. 7)? Children above ~ 9 years of age often inferred Some signals to be the main transmission medium (e.g., Yaghobová, 2021), possibly due to this idea’s contingency on the conception of satellites (e.g., Botturi, 2021). Various correct, incorrect or unspecific signal types were mentioned (e.g., infrared rays: Brinda & Braun, 2017; ‘some’ waves: Yaghobová, 2021).

Fig. 7

Key themes related to (C) resending/routing and (D) transmission medium

However, the more typical transmission medium nowadays is Cables. They were registered less frequently albeit not rarely (e.g., Diethelm et al., 2012a). The dataset does not provide much information about whether or not adolescents know about the existence of undersea cables.

A lack of conceptual knowledge was also noted, likely due to the invisibility of the medium. Specifically, some children ascribed the ability to transfer data to Applications, i.e., “smartphones can be connected via applications”, without referring to anything external to the devices. Yaghobová (2021) even observed one ninth-grader with this conception. This idea, likely based on direct sensory experience with smartphones without any conceptual understanding, may be prevalent among younger children, but the youngest-cohort studies did not examine this issue.

4.5 Dataflow

A. Storage

What do children think about data storage on the internet (Fig. 8)? Data is stored on millions of invisible servers in a distributed way. The idea of Server-like distributed storage (without necessarily knowing the word ‘server’) was repeatedly evidenced among adolescents (8 papers); it appeared as early as Grade 5, but as relatively expert knowledge. For instance, Yaghobová (2021) reported that around a quarter of Grade 5 and half of Grade 9 students connected servers with some form of data storage; however, 14/26 Grade 5 and 9/26 Grade 9 students linked servers primarily with playing games. This again highlights the role of personal experience, even among adolescents (e.g., typing the server name when launching a multiplayer game).

Fig. 8

Themes related to storage of data

Among the incorrect ideas, storage in a Central computer was most frequent (e.g., 123/308 12–16-year-olds thought webpages are stored in a central computer: Papastergiou, 2005). However, students could have meant multiple central computers and it is sometimes unclear how far this notion is from the correct conception of servers (e.g., Brinda et al., 2018). Less frequent are the conceptions of storage in Satellites or Only inside user devices (e.g., Papastergiou, 2005), the possibly obsolete misconception of storage Inside the modem (Hammond & Rogers, 2007), and rare ideas, e.g., “in clouds up in the sky” (Brinda & Braun, 2017). Weak evidence indicates that some of these conceptions may also be present among 6–9-year-olds (Eskelä-Haapanen & Kiili, 2019; Mai et al., 2022; Fig. 8), but the youngest-cohort studies rarely examined the idea of storage explicitly.

B. Downloading / sending data

There is solid evidence that many 3–9-year-olds have an idea that applications can be “brought to the phone” by some clicking; an idea coded as Pre-downloading (e.g., Edwards et al., 2018; Mertala, 2019; Fig. 9). Based on the studies, it is difficult to estimate how much these children really understand downloading given that many of them conflate the device and the internet. A few of them probably have a Better understanding of downloading than others (Oliemat et al., 2018). In older cohorts, the concept of downloading per se was not probed often, though ~80% of European 12–16-year-olds know how to save a photo found online (Šmahel et al., 2020).

Fig. 9

Key themes related to (B) downloading, (C) communication speed/quality and (D) addressing

Two works registered that some adolescents know that Data is encrypted during transmission (Brinda & Braun, 2017; Lindmeier & Mühling, 2020), but it is difficult to estimate the frequency. Conceptions regarding this topic do not appear to have been examined much with K-9 children, in contrast to adults (e.g., Dechand et al., 2019; Whitten & Tygar, 1999).

Anecdotally, various conceptions of streaming were noted in the study by Diethelm et al. (2012a). These include Returning the video back to the server after it ends or Playing the video directly on the server. The same work reported that half of participants believed that Data is sent in packets and the other half that Data is sent in one part.
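The contrast between the two conceptions, Data is sent in packets and Data is sent in one part, can be made concrete with a minimal sketch. The function names, the packet size and the message are illustrative assumptions; real protocols add headers, checksums and retransmission, which are omitted here.

```python
import random


def to_packets(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [
        (seq, data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(packets):
    """Restore the original data by ordering chunks by sequence number."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(chunk for _, chunk in ordered)


message = b"hello internet"        # hypothetical payload (14 bytes)
packets = to_packets(message, 4)   # packet size of 4 bytes is an arbitrary choice
random.shuffle(packets)            # packets may arrive out of order
restored = reassemble(packets)
print(len(packets), restored == message)
```

The sequence numbers are what allow the receiver to rebuild the message even when the pieces arrive in a different order, which is the step missing from the ‘one part’ conception.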

C. Communication speed and quality

Availability and quality of internet connection depend on Distance to the access point, and this is known even by some 5-year-olds; presumably because of prior experience with this issue (Mertala, 2019; Fig. 9). There is evidence that some children above Grade 4/5 infer other reasons for fluctuating quality and speed of connection, including correct or partly correct assumptions like Thick cables enable quicker transmission (Diethelm et al., 2012a), Larger data travels longer (Diethelm et al., 2012a; Yaghobová, 2021) and Longer distances imply longer transmission time (Yaghobová, 2021), and the incorrect premise of a Bottleneck on the central computer (Diethelm et al., 2012a). In general, research data on this topic is limited.

D. Addressing

Around half of the children in Grade 5 and older were found to have vague ideas that computers are identified by some addresses (Brinda & Braun, 2017; Diethelm et al., 2012a; Yaghobová, 2021); coded as Vague IP address conception. Addressing through a phone number appears to be a robust conception among Grade 5+ children (Brinda & Braun, 2017; Yaghobová, 2021), which may be contingent on the use of smartphones and WhatsApp (ibid.). Only a few respondents ascribed to the address some additional, correct features such as uniqueness (ibid.). Evidence is limited, though, and is missing for younger children.

For sending emails especially, three older studies with children aged above 9 years (Diethelm et al., 2012a; Hammond & Rogers, 2007; Mumtaz, 2002) reported a Mail box / mail service metaphor, which may, however, be less familiar to children nowadays.

4.6 Contradictions

Evidence of direct contradictions among conceptions was found in only five papers in diverse age groups; a lower number than one might expect. However, contradictions in children’s conceptions were also exhibited implicitly without the original authors noticing it. For example, as noted above, children could hold a view that The internet is everywhere and at the same time they knew they could not access the internet without signal (e.g., while they are in a forest; see Footnote 3).

5 Discussion and conclusion

How do 3–15-year-olds understand the internet? Based on this review, the following picture is emerging.

Concerning general conceptions about the internet (RQ1), preschoolers who have had prior online experiences tend to view the internet in terms of online activities; typically viewing videos or playing games. In the age category of ~6–9 years, some of them also adopt a view of the internet as a place for information and communication: likely contingent on the child’s growing experiences. As children grow older, some, but not all of them, gradually start to build an increasingly complex and sophisticated structural conception of the internet: ranging from ‘the internet is something connected to the device’ to ‘a complex, global network’. Yet, even at the lower secondary level (ISCED-2), there are children who lack basic structural understanding and remain with an activity-oriented one. Almost no 15-year-old has a correct understanding of the internet.

In short, digital natives are not digital experts and growing personal experience does not necessarily translate into a better understanding of the internet’s underlying fabric. This part of the picture was already suggested in previous studies (e.g., Chaudron et al., 2015; Yan, 2005, 2006, 2009), but this review corroborated it with new evidence; especially, with a specific focus on internet infrastructure (RQ2) and dataflow (RQ3). More precisely, this review highlighted the following:

  • In terms of internet access, a recurring theme is the ‘Wi-Fi box’ (Wi-Fi router) through which many children appear to start to understand Wi-Fi/Internet connection. However, regardless of age, the majority of children exhibit confusion about what the Wi-Fi signal/network really is.

    • As for other forms of access: even some preschoolers can understand them, and their knowledge is probably built vis-à-vis new experiences.

  • When it comes to the invisible internet structure (rather than just visible access to it), understanding becomes severely limited even among lower secondary school students (while data for younger children is rare).

    • Some children think devices communicate directly with each other.

    • A dominant role in children’s ideas is played by satellites, followed by central computers and broadcasting towers: both for storing data and resending it. Resending is often meant in a ‘single-hop’ fashion: through one intermediate point.

    • A ‘multi-hop’ idea – that the internet has many ‘crossroads’ (networking devices) and data is managed via servers – is less frequent, albeit not rare: we estimate that approx. 1/4 to 1/3 of students have this idea at around the age of 12–14. A converging view of children’s limited understanding of how their data are stored and handled on the internet has also been echoed in some digital literacy and privacy studies (e.g., Bowler et al., 2017; Sun et al., 2021; reviewed in Quayyum et al., 2021).

  • The concept of downloading is familiar even to young children, but probably only some adolescents know how it actually works.

  • Adolescents appear to have some idea of data encryption, addressing and fluctuations in connection quality; but literature concerning children is surprisingly scarce.

On a theoretical level, the present results can be easily understood in the Vygotskian framework (1987; Edwards et al., 2018). Children start to build their understanding through personal experiences only (‘everyday’ conceptions) and gradually enrich them through explanatory-based conceptions of the underlying invisible mechanisms (‘scientific’ conceptions). Two points are noteworthy in this regard.

First, children appear to have similar conceptions today as they had 20 years ago: almost no conceptual shift is demonstrated, aside from the absence of some ideas related to obsolete technology, such as ‘internet access via phone line’ (note, however, that the oldest study with 3–9-year-olds in this dataset is from 2011).

Second, the review suggests that internet conceptions are predominantly experience-based even in adolescence. Related to that, we also informally noted that children’s knowledge appears to be fragmented, although direct contradictions in children’s statements were not frequent. This could suggest a lack of educational scaffolding (despite updated curricula; e.g., Hubwieser et al., 2015; Mannila et al., 2014) along with general difficulty in acquiring conceptual understanding of the internet (Bordoff & Yan, 2017) or, more generally, of invisible parts of any complex networked artifact, e.g., an electric grid system (Hallström & Klasander, 2017). This difficulty can be a manifestation of a well-known conceptual change phenomenon: that acquiring knowledge of some concepts is surprisingly difficult because prior conceptions complicate learning (Vosniadou, 2013).

How does this review contribute to education about and with internet-related technologies? By making child conceptions explicit, the review makes it easier for practitioners to activate relevant prior knowledge in children in order to update their beliefs (rather than creating a separate and possibly contradictory understanding of the concept; cf. Diethelm et al., 2012b; Duit et al., 2012). Furthermore, the results suggest that it could be fruitful to study ‘invisible through visible and enacting’ (cf. Hallström & Klasander, 2020). For instance, accessing the same cloud document from multiple devices in the classroom at the same time or switching the ‘classroom Wi-Fi box’ on/off when children are connected to it can materialize the internet as an outside-device entity (cf. Mertala, 2019). Tracking the cable from the visible classroom Wi-Fi router to the ‘invisible’ school server room can draw attention to otherwise hidden computing devices. These devices can also be visualized by means of digital simulations (cf. Prvan & Ožegovič, 2020).

When interpreting the results, however, it is important to keep in mind limitations associated with both the review and included studies.

  1. Some conceptions have rarely been examined among specific, if not all, age groups. For instance, there appears to be limited information about addressing or transmission media among primary school children; or about cookies or functioning of the web for any age group. In this regard, it is necessary to restate that absence of evidence is not evidence of absence: the fact that the review did not find much evidence, for instance, that 10-year-olds know that internet-based communication depends on cables does not automatically imply that they do not know about it.

  2. Some of the reviewed studies did not deeply scrutinize knowledge of the participants. For example, if children drew their ‘idea about what the internet is’ as a monitor with the YouTube website, this does not necessarily mean that all of these children exhibited no structural conception of the internet (as is sometimes implied by drawing studies, cf. Botturi, 2021).

  3. Some themes were inferred in this review and the primary studies from children’s statements, but they may imply different meanings or they may be mere linguistic metaphors. For example, it is not always clear what children mean by ‘central computers on the internet’: one computer, several of them, or millions of them? Perhaps they mean servers but simply do not know the word ‘server’. ‘Satellites’ could also represent satellite dishes as well as satellites orbiting Earth, or both (as is apparent from the drawing studies). The statement “I have the internet in my phone” could signify internet connection rather than the idea that the internet is a tangible component inside one’s device. More detailed clinical interviews (e.g., Ginsburg, 1997) and linguistic analyses (cf. Brante & Walldén, 2021) would probably be required to overcome these issues.

  4. The review does not show developmental pathways for individual children creating their conceptions, because the studies are not longitudinal.

  5. One could speculate that children’s understanding may differ depending on the country in which they live. This could be, for example, due to country-specific parental mediation strategies or media exposure. However, the sample is currently too small to examine this issue.

These limitations do not, however, undermine the key observations of this review. They suggest possible future steps for the field. Studies should increase their breadth with more nuanced age groups and themes, together with their depth (details of the investigation). Bringing more information about sources of children’s knowledge would also be illuminating: especially, pinning down when the source is a peer, a popular movie, self-learning, formal education or something else. Longitudinal studies would be a welcome addition. A future review can expand the dataset by including data literacy and privacy studies (see Quayyum et al., 2021), which would bring to light digital footprint and surveillance-related conceptions: both technical and social ones. Finally, finding ways to improve internet understanding should be the ultimate goal. This latter branch of research falls within the scope of our next research.