This blog describes the design and architecture of my portfolio website. For no reason other than wanting to, I decided to build everything from scratch and have complete control over how the data on the site is created and managed. At a high level, the site has 3 segments: the pre-rendered public pages which you are currently seeing, the admin pages to create, update and delete blogs, projects and updates (where I am writing this blog), and a backend to manage auth and data. So there are two products on the frontend, one for everyone and another just for the admins, both built on React but using different rendering techniques to optimise their respective experiences. And then there is the API service backend, written on Node+Express, which manages all the data coming in from the admins and going out to everyone. Let's look at each of these in the following segments.
The heart of the product is its backend, which is built on Node+Express and coded with the magic of TypeScript. It follows typical backend practices and connects to a MongoDB server hosted on MongoDB Atlas to fetch, update and delete the blogs, projects and updates for the website. It exposes a bunch of APIs: the GET endpoints back both the public pages and the admin pages, while the POST endpoints back only the admin pages, which manage the website's data.
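Here's a minimal sketch of what that split can look like in Express. The route names, the auth check and the data-access stubs are placeholders for illustration, not the site's actual code:

```ts
import express, { Request, Response, NextFunction } from "express";

// Placeholder data-access functions; in the real service these talk to MongoDB Atlas.
async function fetchBlogs(): Promise<object[]> {
  return []; // stub
}
async function createBlog(doc: object): Promise<object> {
  return doc; // stub
}

const app = express();
app.use(express.json());

// Hypothetical auth guard: reject writes that don't carry a valid admin token.
function requireAdmin(req: Request, res: Response, next: NextFunction) {
  if (!req.headers.authorization) {
    res.status(401).json({ error: "unauthorized" });
    return;
  }
  next();
}

// GET: serves both the public pages and the admin pages.
app.get("/api/blogs", async (_req, res) => {
  res.json(await fetchBlogs());
});

// POST: admin-only write path, guarded by the auth middleware.
app.post("/api/blogs", requireAdmin, async (req, res) => {
  res.status(201).json(await createBlog(req.body));
});

app.listen(4000);
```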
The goal of writing the API service was not just to handle this website's data, but to have a central server for any product I build in the future as well. The reason was to optimise the cost of hosting a server on a public network, accessible from anywhere. With a central service managing APIs for all my products, I wouldn't need to deploy multiple servers on multiple instances for what is, to start with, a very limited audience.
Let's address the very basic question of why 2 frontends for 1 site. Why couldn't there have been just one frontend with the capabilities of both, serving the public pages and the admin pages?
I'd initially started with a single-page application (SPA) built on React and was extremely happy when the end-to-end flow of CRUD operations was working and deployed. But soon, I realised that a React SPA doesn't work well with SEO. To generate a production build, webpack bundles all of React's logic into JavaScript files, and references to these files are embedded into a template HTML, which, when served to the requester, asks the browser to load the JavaScript files separately. This isn't good for SEO because everything on the site is essentially the output of JavaScript running. For example, the very first operation of the web app is to render the index component into an element on the HTML template. So, if you block all JavaScript on the page, even the top-level React component will not be rendered and all you'll get is the template, which will most likely be a blank page.
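To make this concrete, here's a minimal sketch of an SPA entry point; the component and element id are illustrative, not this site's actual code. The template HTML that webpack emits contains little more than a root div and a script tag for the bundle, so everything the visitor sees is created by this call at runtime:

```ts
import React from "react";
import { createRoot } from "react-dom/client";

// Stand-in for the site's real top-level component.
const App = () => React.createElement("h1", null, "Hello from React");

const rootElement = document.getElementById("root");
if (rootElement) {
  // If JavaScript is blocked (or hasn't downloaded yet), this never runs,
  // so the visitor, or a crawler, only ever sees the near-empty template.
  createRoot(rootElement).render(React.createElement(App));
}
```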
Now, let's talk about how this affects SEO. Google's crawler is not famous for waiting for a page's JavaScript to load before it moves on to the next page to index. So when the crawler requests ankitsinha.in, it'll just see a blank page and move on to another website. Essentially, the crawler understands nothing on the page, and there's nothing for it to index the page on. This is why I felt the need to migrate to a different technique for rendering the pages.
There are 3 common techniques for serving web content to users -
Client-side rendering;
When you build an application with React and use webpack or any other bundler, it produces one big chunk of initial JavaScript which, once downloaded after the initial HTML page load, flushes the components and data into the HTML. So both the render and the data fetch happen on the client side. This is client-side rendering.
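Here's a rough sketch of that pattern, assuming a hypothetical /api/blogs endpoint: the first paint shows an empty component, and the data is filled in only after the bundle runs in the browser.

```ts
import React, { useEffect, useState } from "react";

type Blog = { title: string };

function BlogList() {
  const [blogs, setBlogs] = useState<Blog[]>([]);

  useEffect(() => {
    // Runs in the browser, after the bundle has downloaded and React has mounted.
    fetch("/api/blogs")
      .then((res) => res.json())
      .then((data: Blog[]) => setBlogs(data));
  }, []);

  // On first paint this renders an empty list; the content arrives later.
  return React.createElement(
    "ul",
    null,
    blogs.map((b) => React.createElement("li", { key: b.title }, b.title))
  );
}

export default BlogList;
```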
Server-side rendering;
Now with server-side rendering, the user requests a page, the server understands what is being requested, fetches all the data the page relies upon, creates an HTML page out of it (template + data), and then sends it back to the client. So the very first HTML the client receives for the URL it requested already has all the information the page needs! This solves the SEO issue; however, it poses another problem.
Let's say there's a page which depends upon an API call, which internally queries the DB and serves the result back to the page. On every request for the page, the server will call the API to fetch the data: more page requests, more API calls. That's unnecessary overhead if your APIs return the same data again and again and the data changes only occasionally. It reduces the performance of the page and increases the load on the APIs. There's a solution to this as well, namely caching to improve the performance, but that's a story for another blog. :D
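To make that per-request work concrete, here's a rough sketch of an SSR handler using Express and react-dom/server. The route, the API URL and the page component are assumptions for illustration, and it relies on Node 18+ for the global fetch:

```ts
import express from "express";
import React from "react";
import { renderToString } from "react-dom/server";

type Blog = { title: string };

// Stand-in page component.
const BlogPage = ({ blogs }: { blogs: Blog[] }) =>
  React.createElement(
    "ul",
    null,
    blogs.map((b) => React.createElement("li", { key: b.title }, b.title))
  );

const app = express();

app.get("/blogs", async (_req, res) => {
  // Every request pays for this fetch, even when the data hasn't changed.
  const blogs: Blog[] = await fetch("https://api.example.com/blogs").then((r) => r.json());

  // The client receives fully populated HTML on the very first response.
  const html = renderToString(React.createElement(BlogPage, { blogs }));
  res.send(`<!doctype html><html><body><div id="root">${html}</div></body></html>`);
});

app.listen(3000);
```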
Static site generation;
So the choice comes down to static-site-generated pages. This technique fetches the data and creates the HTML files at build time itself, and the generated pages are served statically when their URLs are requested. This lets you serve the pre-rendered pages quickly. Additionally, you get the power to cache this static content on CDNs to improve the performance even further. However, there are issues with this approach as well. Since the pages are generated during the build, any change to the data requires the page's APIs to be fetched again, which will only happen if we build the pages again.
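Here's a minimal sketch of build-time generation, assuming a hypothetical API URL and output folder: the data is fetched once, rendered to HTML, and the files are written to disk so they can be served statically (and cached on a CDN).

```ts
import { writeFileSync, mkdirSync } from "fs";
import React from "react";
import { renderToString } from "react-dom/server";

type Blog = { slug: string; title: string };

// Stand-in page component for a single blog.
const BlogPage = ({ blog }: { blog: Blog }) =>
  React.createElement("article", null, React.createElement("h1", null, blog.title));

async function build() {
  // Data is fetched exactly once, at build time; any edit requires another build.
  const blogs: Blog[] = await fetch("https://api.example.com/blogs").then((r) => r.json());

  mkdirSync("dist/blogs", { recursive: true });
  for (const blog of blogs) {
    const html = renderToString(React.createElement(BlogPage, { blog }));
    writeFileSync(
      `dist/blogs/${blog.slug}.html`,
      `<!doctype html><html><body>${html}</body></html>`
    );
  }
}

build().catch(console.error);
```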
Although SSG stops the site's content from updating automatically whenever I publish a blog from the admin, it was the fastest way for me to migrate to a rendering technique that strongly supports SEO.
So how do we migrate a fully functional CSR application, complete with authentication and a CMS (content management system) for blogs, projects and (Twitter-copy) updates, to server-generated pages? The admin pages (CUD) for blogs, projects and updates cannot be migrated to SSG so easily, as they deal with real-time data. So I decided to split my frontend into 2 separate apps. The first contains only the public pages, i.e. the pages containing the published data, statically generated and cached on CDNs. The second app consists of just the pages to manage CUD. To me, this approach was the quickest to implement without compromising the abilities of the website. I'll cover the details of how this was done in a separate blog. :D