# Re: Can someone add some more HTML tags, please? (Dries)

(created Feb 9, 2021)

Gemini critics suggest using a subset of HTTP and HTML to create a "Small Web" or "Safe Web" for documents, instead of using a new application layer protocol. I infer from this post by the creator of the Drupal CMS that it would be difficult for a Small Web community to settle on an acceptable subset of HTTP and HTML. How many of the good parts of the modern web would be permitted in the web subset? To create a web subset or a Small Web, critics suggest using HTTP version 0.9 or 1.x, but what's wrong with using HTTP/2?

=> https://dri.es/can-someone-add-some-more-html-tags-please

Would a Small Web subset only support CSS2 and HTML4 with no support for JavaScript? If so, then an existing web browser like NetSurf would work fine from the start. It would be nice, however, if the Small Web offered basic HTML form support and the POST request, along with the GET request. But this type of interactivity might violate the Small Web principle.

In my opinion, the HTTP/HTML subset idea would fail because no agreement would be reached on the definition of the subset. Would the Small Web need to support dynamic, client-side applications, or only a web of documents?

The following post was one of the best articles that I read in 2019.

=> http://blog.danieljanus.pl/2019/10/07/web-of-documents/

Excerpts from the Web of Documents post:

> We don't have a Web of Documents anymore.
> A book is a document, to me. So is a picture, an illustrated text, a scientific paper, a MP3 song, or a video. By contrast, a page that lets you play Tetris isn't. The essence of this distinction seems to be that documents have well-defined content that does not change between viewings and does not depend on the state of the outside world. A document is stateless. It exists in and of itself; it is its own microcosm.
> These days, the WWW is mostly a Web of Applications.
> But the more I think about it, the more sense it makes to me to attack the problem closer to its root: to decomplect the notions of a document and an application; to keep the Web of Applications as it is, and to recreate a Web of Documents--either parallel to it, or as its sub-web.

solderpunk chose another route in the summer of 2019: create a new application layer protocol called Gemini to support the Internet of Documents concept. Gopher supports this notion too. Gemini, however, makes the reading and writing experiences a little easier than Gopher.

More from the danieljanus.pl post:

> To do this, we need to take a step back. (Or do a clean start and invent a whole new technology, but this is unlikely to succeed).

What's the author's definition of "succeeding"? As of now, February 2021, would the author consider Gemini a success thus far? I would. Definitely. It's an amazing success.

Back to the Web of Documents post with my comments interspersed.

> Fortunately, we don't have to travel all the way back to 1992, when the WWW was still a Web of Documents. (I still remember table-based layouts and spacer gifs, and the very memory makes me shudder). I think we can base the new Web of Documents on ol' trusty HTTP (or, better, HTTPS), HTML and CSS as we know them today, with just three restraints:
> 1. No methods other than GET (and perhaps HEAD). POST, PUT, DELETE and friends just have no place in a world of documents.

Gemini offers only one kind of request, the equivalent of GET. Gemini supports the Common Gateway Interface, but the only form item permitted is one text input field per request.

> 2. No scripts of any kind. Not JavaScript, not WebAssembly. Not even to enrich a document, such as syntax-highlight the code snippets. This one may seem too stringent, but I think it's better to err on the safe side, and it's very easy to enforce.

Gemini satisfies that restraint. Gemini mostly satisfies the first restraint too.

> 3. No cookies.
> Cookies by themselves aren't interactive, but having them makes it all too easy to abuse the semantics of HTTP to recreate sessions, and on top of them reinvent the app-wheel and eventually forfeit the Web of Documents again.

Gemini satisfies restraint number 3.

> How do we achieve this? I don't know, really.

Ah, yes. The mystery. This is why Gemini is the better answer: it exists NOW, while the subset web does not exist in any official, restricted form.

The author likes the idea of using some aspects of HTML5 and CSS3 in the Web of Documents. That sounds like too much technology for documents. Maybe some of HTML5's semantic tags could be useful, but for documents, Gemini text probably satisfies most concerns.

If Gemini text is too limiting, then the Small Web group should focus on an enhanced version of CommonMark, eschew HTML and CSS, and let Small Web websites serve up only Markdown .md files, which means that Small Web browser users control the display of the content.

As mentioned in many previous posts, here is my test Markdown site that I started two years ago, in January 2019. Only Markdown is used, except for the index homepage, which is used to explain what in the hell is going on.

=> http://md.soupmode.com/home.md

Example posts:

=> http://md.soupmode.com/2019/12/05/block-ads-on-websites-not-within-podcasts.md
=> http://md.soupmode.com/2018/09/20/its-easy-to-eschew-social-media.md
=> http://md.soupmode.com/rossford-island-view-park-in-the-fall.md

When viewing those Markdown posts within an up-to-date version of Firefox or Chrome, a reader will see plain text. We need a Small Web browser that renders Markdown and displays the content according to typography settings made by each reader.

The Markdown viewer web browser extensions offer themes that readers can choose from within the extension, and the extensions permit readers to upload or paste in custom CSS.
This makes all Markdown websites display the same way for each individual reader. My typography choices will differ from other readers' choices.

Actually, a Small Web Markdown browser exists. It's called Kristall, an internet browser that supports the following application layer protocols: Gemini, Gopher, Finger, and HTTP/HTTPS, but only a subset of HTML is permitted. Kristall does not support inline or embedded images.

I can read http://md.soupmode.com/home.md and the posts listed on that homepage within the Kristall browser. Kristall permits readers to control the typography settings.

I'm guessing that the Web of Documents or the Small Web crowd would find the Markdown-only web pages to be too limiting.

And this is why I love the Gemini application layer protocol. Gemini along with the Kristall browser satisfies the ideas that led me to create md.soupmode.com. Instead of advocating for some kind of subset web, I will use Gemini.

Back to the fall 2019 Web of Documents post:

> How do we achieve this? I don't know, really. I don't have a concrete proposal. Perhaps we could have dedicated browsers for the WoD; perhaps we could make existing browsers prominently advertise to the user whether they are browsing a document or an application. On top of all the technical decisions to make, there'll be significant campaigning and lobbying needed if the idea is ever to take off.

NetSurf, Links2, and other limited web browsers create Small Web experiences, but most of these browsers do not support HTML5 or CSS3. Simply using NetSurf may not satisfy the author of that WoD post.

The campaigning and lobbying for the Small Web could be something to watch. This is why I think that it's easier to create a new application layer protocol like solderpunk did.

The author concluded with:

> I don't dare dream that it ever will. My intent in this article is to provide food for thought. All I ask from you, my reader, is consideration and attention.
> And if you got this far, chances are I got them. I'm grateful.
> This page is a document. Thank you for reading it.

That post was published in October 2019, four months after solderpunk announced project Gemini. On the author's blog, I see no mention of Gemini since the WoD post.

Now to Dries's January 2021 post.

> Every day, millions of new web pages are added to the internet. Most of them are unstructured, uncategorized, and nearly impossible for software to understand. It irks me.
> Look no further than Sir Tim Berners-Lee's Wikipedia page:

Would Wikipedia pages be considered documents? The editing of the pages requires forms or JavaScript and maybe the POST and PUT requests, but the READING of Wikipedia should be a Web of Documents experience. I should not need JavaScript to read TBL's Wikipedia page.

=> https://en.wikipedia.org/wiki/Tim_Berners-Lee

TBL's Wikipedia page displays fine within the NetSurf web browser. It displays well enough within Links2. It displays okay within Lynx, except for a couple of paragraphs that contain some highlighted passages for some reason. Maybe the HTML is malformed.

Links2, links, Lynx, w3m, and other similar browsers remove content display concerns from the publishers. Links2 permits readers to control basic typography settings. Links2 is like a permanent reader mode, except all of the cruft and crapware don't get downloaded first.

NetSurf permits publishers to offer default display settings for the content. I still like the idea that Gemini makes publishers focus on creating content, instead of creating user experiences.

=> https://www.netsurf-browser.org/

NetSurf supports HTML 4.01 and CSS 2.1. Does that satisfy the Small Web/Safe Web/Subset Web crowd, or is that too limiting? That seems plenty to me, especially since NetSurf and Links2 support forms and POST requests.

But the WoD author wants the WoD Web to support HTML5 and CSS3. Both of those so-called standards will continue to grow. Dries wants more HTML tags.
That does not feel Small Webby to me.

From Dries:

> Wikipedia is the world's largest source of knowledge. It's a top 10 website in the world. Yet, Wikipedia's markup language is nearly impossible to parse, Tim Berners-Lee's Wikipedia page has almost 100 HTML validation errors, and the page's generated HTML output is not very semantic. It's hard to use or re-use with other software.
> It's not just Wikipedia. Every site is still messing around with custom <div>s for a table of contents, footnotes, logos, and more.

When Gemini critics suggest using a subset of the web, the above comment by Dries is one of the reasons why I prefer a SEPARATE application layer protocol. As a READER, I want full control and easy control over how the content displays to ME.

I consider TBL's Wikipedia page to be a Web of Documents type of page. Why does this WoD page need to use so much complicated or convoluted HTML?

More from Dries:

> I could think of a dozen new HTML tags that would make web pages, including Wikipedia, easier to write and reuse: , , , and many more.
> A good approach would be to take the most successful Schema.org schemas, Microformats and Web Components, and incorporate their functionality into the official HTML specification.

Again, I'm thankful for Gemini's existence and its use of Gemini text. No need for tag soup. And since the spec is nearly frozen, Gemini won't get bloated with too much crap.

When I supported more IndieWeb.org tech on my website, I enjoyed using Microformats. Would a Small Web or a subset web support Microformats, or is that too complicated?

I doubt that a consensus would ever be reached on what defines the Small Web or the Web of Documents.

More from Dries:

> Adding new semantic markup options to the HTML specification is the surest way to improve the semantic web, improve content reuse, and advance content authoring tools.

I cannot believe that after 30 years, the HTML spec still does not have enough features to satisfy Dries.

Maybe the Small Web/Subset Web would START OVER from scratch, creating a new HTML spec that would contain more semantic tags and other useful tags while leaving out what's mostly unnecessary, assuming that a consensus could be reached.

Maybe Gemini has succeeded thus far because solderpunk is the benevolent dictator for now. A community has collaborated to develop Gemini, but solderpunk seems to have final say on what makes it into the spec.
Maybe this works better, for now, than having a large group of people voting to make decisions, which could lead to bloating the spec.

Dries concluded by pushing to increase the size of HTML.

> If you want to help make the web better, you could literally start with Sir Tim Berners-Lee's Wikipedia page, and use it as the basis to spend a decade pushing for HTML markup improvements. It could be the start of a long and successful career.

What would a Gemini version of TBL's Wikipedia page look like? It could be harder to read because of the way Gemini handles links, which I like. I'm not the biggest fan of links used within the body of the text when the publishers remove the underline from the links and make the link text display with a color that's similar to the body text.

Dries did not provide any reason why he would like the Wikipedia page to use more semantic tags. How would he reuse the content?

-30-

```
dir : 2021/02/09
```
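
A footnote on the way Gemini handles links, as a rough Python sketch. Gemini text gives every link its own line starting with "=>", so a client can classify each line just by looking at its first few characters. The function name parse_gemtext and the tuple layout are my own assumptions for illustration; they do not come from the Gemini spec or from any existing client.

```python
def parse_gemtext(text):
    """Classify each line of Gemini text by its leading marker.

    Returns a list of tuples. This is an illustrative sketch, not a
    complete or official text/gemini parser.
    """
    items = []
    preformatted = False
    for line in text.splitlines():
        if line.startswith("```"):
            # A fence line toggles preformatted mode and is not content.
            preformatted = not preformatted
        elif preformatted:
            # Inside a fence, nothing is interpreted, not even "=>".
            items.append(("pre", line))
        elif line.startswith("=>"):
            parts = line[2:].split(maxsplit=1)
            if not parts:
                # "=>" with no URL: treat it as ordinary text.
                items.append(("text", line))
                continue
            url = parts[0]
            # The label is optional; fall back to showing the URL.
            label = parts[1].strip() if len(parts) > 1 else url
            items.append(("link", url, label))
        elif line.startswith("#"):
            items.append(("heading", line.lstrip("#").strip()))
        elif line.startswith("* "):
            items.append(("list", line[2:].strip()))
        elif line.startswith(">"):
            items.append(("quote", line[1:].strip()))
        else:
            items.append(("text", line))
    return items


sample = "\n".join([
    "# Re: Can someone add some more HTML tags, please? (Dries)",
    "Body text never contains inline links.",
    "=> https://dri.es/can-someone-add-some-more-html-tags-please Dries's post",
    "=> http://md.soupmode.com/home.md",
    "> This page is a document.",
])

for item in parse_gemtext(sample):
    print(item)
```

Because a link can never hide inside a sentence, the reader's client decides how links look, not the publisher.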