08/11/2024

[Google Sheets] Import data from Investing.com

DISCLAIMER: This is not a stock picking recommendation. The tickers shown are used as examples and are not provided as financial or investment advice.

DISCLAIMER2: Web scraping can go against usage policies so read them, understand them and abide by them. Also web scraping is notoriously flaky as small changes in the retrieved page can affect the location of the desired element(s).

We have already seen how to import data from Yahoo Finance and Coingecko. Now we add Investing.com as a source, which unfortunately does not provide an API, so we have to do some HTML scraping instead.

We would like to retrieve the current price of an ETF called JPNA. By looking at that page we can luckily identify a specific locator, which then allows us to do a couple of string manipulation operations to extract the desired value:

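function getInvestingData(path) {
  var result = UrlFetchApp.fetch("https://www.investing.com/etfs/" + path.toLowerCase());
  var response = result.getContentText();
  // locator that immediately precedes the price in the page HTML
  var marker = ' data-test="instrument-price-last">';
  var markerIndex = response.indexOf(marker);
  if (markerIndex < 0) {
    // the page layout changed or the ETF path is wrong
    throw new Error("Price locator not found for " + path);
  }
  var start = markerIndex + marker.length;
  var end = response.indexOf('</div>', start);
  var out = response.substring(start, end);
  // remove ALL thousands separators, not only the first one
  var strip = out.replace(/,/g, "");
  return Number(strip);
}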

19/10/2024

[Google Sheets] Import crypto data from Coingecko

DISCLAIMER: I do not recommend investing in crypto. All tokens you will see are used as samples and are not provided as financial or investment advice.

We've already seen how to add custom functions to our Google Sheets projects, importing Yahoo Finance data. Now we expand this to import cryptocurrency data from Coingecko.

I could have picked any of the million sources, but this one has a simple free plan which works well for light usage.

After registering and getting your API key, RTFM to find that they provide a lot of stuff. What we care about in this example is the pricing data. You will need to find the ID of the currency you want (NOT the token) by querying the list, and then you can get the data by calling (add your API key at the end):

"https://api.coingecko.com/api/v3/simple/price?ids=" + token + "&vs_currencies=usd&x_cg_demo_api_key="

You can put this in a Google Sheets function as well (you can extend the input parameters to get the quote in different currencies):

function getCoingeckoData(path) {
  var result = UrlFetchApp.fetch("https://api.coingecko.com/api/v3/simple/price?ids=" + path + "&vs_currencies=usd&x_cg_demo_api_key=YOUR_API_KEY");
  var response = result.getContentText();
  var json = JSON.parse(response);
  return json[path]['usd'];
}

04/10/2024

[Java] Get type of elements in collection using reflection

Java's type erasure means that at runtime some information is removed/replaced from generic declarations, which poses a minor challenge when trying to identify the actual parameter types using reflection. The declared signature of a field is still kept in the class file though, so we can read the type arguments from there.

Here is a way:

import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;


//maybe you are looping over fields or have a field of type Field f
ParameterizedType collectionType = (ParameterizedType) f.getGenericType();
Class<?> actualClassOfElementInCollection = (Class<?>) collectionType.getActualTypeArguments()[0];
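
The two lines above assume the field is generic and that its type argument is a plain class. A slightly more defensive sketch of the same idea, runnable on its own, could look like this. The names ElementTypeSketch, Holder, elementType and elementTypeOf are made up for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Optional;

class ElementTypeSketch {

  // Hypothetical holder class, used only for this example
  static class Holder {
    List<String> names;
    List<?> anything;
    int count;
  }

  // Returns the first type argument of the field, if it is a concrete class
  static Optional<Class<?>> elementType(Field f) {
    Type generic = f.getGenericType();
    if (!(generic instanceof ParameterizedType)) {
      return Optional.empty(); // raw or non-generic field, the cast would fail here
    }
    Type[] args = ((ParameterizedType) generic).getActualTypeArguments();
    if (args.length > 0 && args[0] instanceof Class<?>) {
      return Optional.of((Class<?>) args[0]);
    }
    return Optional.empty(); // wildcard, type variable or nested generic type
  }

  // Convenience lookup by field name (assumption: the field is declared on owner itself)
  static Optional<Class<?>> elementTypeOf(Class<?> owner, String fieldName) {
    try {
      return elementType(owner.getDeclaredField(fieldName));
    } catch (NoSuchFieldException e) {
      throw new IllegalArgumentException("No field " + fieldName + " on " + owner, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(elementTypeOf(Holder.class, "names"));
    System.out.println(elementTypeOf(Holder.class, "anything"));
    System.out.println(elementTypeOf(Holder.class, "count"));
  }
}
```

Returning an Optional instead of casting blindly avoids a ClassCastException when the field is not parameterized or uses a wildcard.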

27/09/2024

[Java] Load entity with lazy children collections

When using JPA with lazy collection loading, the lazy children are NOT loaded in memory when the parent entity is retrieved from the DB. Instead, a proxy reference is added, which is used to retrieve the data if that particular field is ever accessed.

In some scenarios you might want to load the whole entity in memory, including the lazy children (e.g. you want to clone or serialize it).

If you want to load ONE child collection together with the parent, a JOIN FETCH clause does the trick:

SELECT p
FROM Parent p
LEFT JOIN FETCH p.childField c
WHERE p.id = :id


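But if you try to fetch more than one child collection in the same query, you will get a MultipleBagFetchException (raised by Hibernate when the collections are mapped as bags, e.g. List). A workaround is to run one JOIN FETCH query per child collection sequentially, within the same persistence context so that each query initializes one more collection on the same managed parent, for example: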

private Parent loadChildViaQuery(ID id, String childClause) {
  return entityManager
    .createQuery(
      "select p " +
      "from Parent p " +
      "left join fetch " +
      childClause +
      " where p.id = :id",
      Parent.class
    )
    .setParameter("id", id)
    .getSingleResult();
}

Where childClause input is the join statement you need, for example:

"p.childField c"

Also remember that if the parent entity is not found, getSingleResult throws a NoResultException, while if the child is not found no exception is raised and the child is simply null/empty.

26/09/2024

[Java] Using annotation processing (and validating it) to execute logic at runtime

Sample project showcasing how to use annotations to perform runtime logic to log changes in object values.

Remember it is a SAMPLE, so (lazy me) most null-safety checks and the like are not included, and some of the logic is only a showcase that should be replaced with a real business scenario.

In this SAMPLE, we tag fields to be included in a diff logic, then print to output when the values of those fields change, by comparing two instances of the same class.

Logic can obviously be made much more complex, including collections and maps and whatnot (Java Generics are your friends there).

Also worth mentioning: JaVers can do most of it for you, unless you have fancy business requirements that force you to write custom code.


Key points:

- How to create an annotation

- How to create an annotation processor to validate the annotation parameters at compile time

- How to register an annotation processor

- How to configure a multi-module Maven project to use a custom annotation processor (also, in the pom of the root project ensure the module containing the processor is built BEFORE everything else)

- How to test an annotation processor by generating classes at runtime and triggering compilation tasks against them. Includes verifying that compilation warnings are triggered as expected.

- How to use reflection to get fields annotated with a given annotation (and then execute whatever logic on them)

- How to use reflection to invoke methods (including static methods)
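
To give an idea of the runtime half only (no compile-time processor here), below is a minimal sketch of tagging fields and diffing two instances via reflection. It is NOT the code from the repo: the annotation @Tracked, its label parameter, and the Person class are hypothetical names made up for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class DiffSketch {

  // Hypothetical marker annotation; RUNTIME retention keeps it visible to reflection
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface Tracked {
    String label() default "";
  }

  // Hypothetical sample class with two tracked fields and one ignored field
  static class Person {
    @Tracked(label = "Full name") String name;
    @Tracked int age;
    String internalNote;

    Person(String name, int age, String internalNote) {
      this.name = name;
      this.age = age;
      this.internalNote = internalNote;
    }
  }

  // Compares the @Tracked fields of two instances of the same class (no null checks, it is a sketch)
  static List<String> diff(Object before, Object after) {
    List<String> changes = new ArrayList<>();
    for (Field f : before.getClass().getDeclaredFields()) {
      Tracked tracked = f.getAnnotation(Tracked.class);
      if (tracked == null) {
        continue; // field is not part of the diff
      }
      f.setAccessible(true);
      try {
        Object oldValue = f.get(before);
        Object newValue = f.get(after);
        if (!Objects.equals(oldValue, newValue)) {
          String label = tracked.label().isEmpty() ? f.getName() : tracked.label();
          changes.add(label + ": " + oldValue + " -> " + newValue);
        }
      } catch (IllegalAccessException e) {
        throw new IllegalStateException("Cannot read field " + f.getName(), e);
      }
    }
    return changes;
  }

  public static void main(String[] args) {
    Person before = new Person("Ann", 30, "first draft");
    Person after = new Person("Ann", 31, "second draft");
    diff(before, after).forEach(System.out::println);
  }
}
```

The untagged internalNote field changes too but is ignored, which is the whole point of opting fields in via the annotation.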


The full project is available on my GitHub repo with commented code: https://github.com/steghio/diff-annotation-processing

05/09/2024

[Java] Serialize POJO to XML according to XSD

If you generate a class from an XSD schema, it will come with the necessary annotations to serialize it to an XML String.

You can therefore easily convert it with:

import jakarta.xml.bind.JAXBContext;
import jakarta.xml.bind.Marshaller;
import java.io.StringWriter;

/**
 * Provides utility methods for serialization scenarios
 */
public class SerializationUtils {

  /**
   * Serialize the given object to XML String using JAXBContext
   * It will set the output to be pretty printed
   * It relies on the object annotations to correctly place and annotate all fields
   * @param object
   * @return the string representation of this object as XML
   * @param <T>
   */
  public static <T> String serializeXml(T object) {
    try {
      JAXBContext jc = JAXBContext.newInstance(object.getClass());

      Marshaller marshaller = jc.createMarshaller();
      marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
      //to completely remove the xml preamble `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>` add this line:
      //marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);

      //marshaller cannot output to string directly
      StringWriter sw = new StringWriter();

      marshaller.marshal(object, sw);

      return sw.toString();
    } catch (Exception e) {
      throw new RuntimeException("Failed to convert payload to xml. ", e);
    }
  }
}

[Java] Generate POJO from XSD in Maven

Assuming you have a nice correct XSD file with proper namespace references and all, then you could convert it to a POJO (or more) using jaxb-maven-plugin.

There are multiple plugins that achieve the same result, and even multiple versions of this plugin, so searching the web can be confusing. As of 2024, simply adding this plugin to the POM works:

<plugin>
  <groupId>org.jvnet.jaxb</groupId>
  <artifactId>jaxb-maven-plugin</artifactId>
  <version>4.0.8</version>
  <executions>
    <execution>
      <id>NAME_FOR_THIS_RUN</id>
      <goals>
        <goal>generate</goal>
      </goals>
      <configuration>
        <schemaDirectory>src/main/resources/FOLDER/USE_CASE</schemaDirectory> <!-- here will be the XSD -->
      </configuration>
    </execution>
  </executions>
</plugin>

21/08/2024

[TypeScript] Debounce function in vanilla TypeScript

Debouncing is a technique to execute a function only once for a burst of triggers, after a configured quiet period has passed with no new trigger. This can be used to improve performance and user experience.

The idea is to start a timer when the desired function is invoked and reset it on each subsequent invocation. If no other call is made within the configured delay after the last invocation, the function is finally executed at the end of that period.
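
The timer-reset logic described above can be sketched in a few lines. This sketch is in Java, using a ScheduledExecutorService as the timer; in TypeScript the same structure would rely on setTimeout and clearTimeout. The class name DebounceSketch and its methods are made up for the example.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class DebounceSketch {

  // Daemon thread so a forgotten shutdown() does not keep the JVM alive
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor(r -> {
        Thread t = new Thread(r, "debounce");
        t.setDaemon(true);
        return t;
      });
  private final Runnable action;
  private final long delayMillis;
  private ScheduledFuture<?> pending;

  DebounceSketch(Runnable action, long delayMillis) {
    this.action = action;
    this.delayMillis = delayMillis;
  }

  // Each call cancels the pending execution (if any) and restarts the timer
  synchronized void call() {
    if (pending != null) {
      pending.cancel(false);
    }
    pending = scheduler.schedule(action, delayMillis, TimeUnit.MILLISECONDS);
  }

  void shutdown() {
    scheduler.shutdown();
  }

  public static void main(String[] args) throws InterruptedException {
    DebounceSketch debounced = new DebounceSketch(() -> System.out.println("executed"), 200);
    // a burst of three calls, only the last one should lead to an execution
    debounced.call();
    debounced.call();
    debounced.call();
    Thread.sleep(600);
    debounced.shutdown();
  }
}
```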

19/08/2024

[Java] Prim algorithm to find Minimum Spanning Tree in a graph

The minimum spanning tree (MST) is a subset of all edges in a weighted, undirected, connected graph such that the resulting graph is still connected and the sum of all edge weights is minimal.

If we are not given a list of edges, but only a list of vertices and a formula to calculate the edge weight given two graph nodes, we can run a preprocessing step to generate ALL possible edges between ALL graph nodes and calculate their weight in O(V^2).

Then, starting from an arbitrary node, we greedily choose the minimum-weight edge that connects an already visited node to a node not yet visited. We continue exploring until all graph nodes have been touched.
There might be multiple valid MSTs for a given graph; this algorithm will return one of them.

We use a priority queue sorted by weight to determine which edge (and therefore node) to visit next. This ensures that if a node is reachable via multiple edges, we always pick the one with the smallest weight. Since the graph is connected, we are guaranteed to eventually reach every node, picking exactly V-1 edges, and the sum of the weights of all the chosen edges is minimal.

This uses O(E) space since we might add all edges to the queue, and runs in O(E log(E)) time since for each edge we add to the queue we pay the O(log(E)) cost of the insert and remove operations.
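
A minimal sketch of the steps above, assuming vertices are labelled 0..V-1 and edges are given as {u, v, weight} int arrays. These are simplifications for the example and this is not the implementation from my Gist.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PrimSketch {

  /**
   * @param vertexCount number of vertices, labelled 0..vertexCount-1
   * @param edges undirected edges as {u, v, weight}
   * @return the chosen edges as {from, to, weight}, for the component containing vertex 0
   */
  static List<int[]> primMst(int vertexCount, int[][] edges) {
    List<int[]> tree = new ArrayList<>();
    if (vertexCount == 0) {
      return tree;
    }
    // adjacency list, each undirected edge is stored once per direction
    List<List<int[]>> adjacency = new ArrayList<>();
    for (int i = 0; i < vertexCount; i++) {
      adjacency.add(new ArrayList<>());
    }
    for (int[] e : edges) {
      adjacency.get(e[0]).add(new int[] {e[0], e[1], e[2]});
      adjacency.get(e[1]).add(new int[] {e[1], e[0], e[2]});
    }

    boolean[] visited = new boolean[vertexCount];
    PriorityQueue<int[]> queue = new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[2]));
    visited[0] = true; // start node, any node would do
    queue.addAll(adjacency.get(0));

    while (!queue.isEmpty() && tree.size() < vertexCount - 1) {
      int[] edge = queue.poll(); // cheapest edge leaving the visited set
      int next = edge[1];
      if (visited[next]) {
        continue; // both ends already in the tree
      }
      visited[next] = true;
      tree.add(edge);
      for (int[] out : adjacency.get(next)) {
        if (!visited[out[1]]) {
          queue.add(out);
        }
      }
    }
    return tree;
  }

  static int totalWeight(List<int[]> tree) {
    int sum = 0;
    for (int[] e : tree) {
      sum += e[2];
    }
    return sum;
  }

  public static void main(String[] args) {
    int[][] edges = {{0, 1, 1}, {1, 2, 2}, {0, 2, 4}, {2, 3, 3}, {1, 3, 5}};
    List<int[]> tree = primMst(4, edges);
    System.out.println("edges: " + tree.size() + ", weight: " + totalWeight(tree));
  }
}
```

If the returned list has fewer than V-1 edges, the graph was not connected and only the component of the start node was covered.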

If the graph is NOT connected, this will NOT return an MST for the whole graph, only the MST of the connected component where the chosen start node resides. We could adapt the algorithm to verify whether there are extra nodes not yet visited, and repeat the processing for each until we have created an MST for each connected component in the graph.
In case the graph is not connected, an alternative is Kruskal's algorithm, which naturally produces a minimum spanning forest.

You can find my implementation of primMinimumSpanningTree on my Gist along with some tests in PrimMSTJTests.

17/08/2024

[Java] Graph union find algorithm

For an undirected graph, we can compute the disjoint sets of nodes, where each set contains all nodes of one connected component.

For each set, we elect a representative; all nodes reachable in a set will have the same representative. The resulting view will be a tree where the representative sits at the root and all connected nodes are its descendants.

Example applications include: quickly verifying whether 2 nodes in a graph have a path to each other (they must belong to the same set) or calculating the minimum number of edges to add to a graph to make it fully connected (or the opposite).

It is based on 2 operations:

find(Vertex x)

which returns the representative of the subset a given node belongs to. We recurse up the tree where this node resides until the representative is found. We optimize the operation for future searches with path compression: once the representative is found, all nodes along the same path are updated to point to it directly. Combined with union by rank, this makes the find operation run in amortized O(α(V)), where α is the inverse Ackermann function. This is considered O(1) in practice: α grows even more slowly than the iterated logarithm log*(V), i.e. how many times we need to apply log(x) to its result, starting with V, until the output is at most 1. It is an extremely slowly increasing sequence.

union(Vertex x, Vertex y)

which connects the subtree where node x resides to the subtree of node y, unless they are already part of the same subtree. To improve efficiency we track in O(V) extra space the rank of each subtree (an upper bound on its depth) and when merging two subtrees, we attach the one with the lower rank under the other, so the overall height of the resulting tree is kept as flat as possible.

It uses O(V) extra space, to track for each node who is the representative of its subset and O(V) to track the rank of the subtree rooted at each node.

It runs in O(E α(V)) time, where α is the inverse Ackermann function, since we run 2 find operations for each edge (pair of nodes) we unite and use union by rank with path compression.
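
A compact sketch of both operations, assuming nodes are identified by int indices 0..V-1 instead of Vertex objects. This is a simplification for the example and not the implementation from my Gist; the connected and components helpers are additions for illustration.

```java
class UnionFindSketch {

  private final int[] parent; // parent[i] == i means i is a representative
  private final int[] rank;   // upper bound on the height of the tree rooted at i
  private int componentCount;

  UnionFindSketch(int size) {
    parent = new int[size];
    rank = new int[size];
    componentCount = size;
    for (int i = 0; i < size; i++) {
      parent[i] = i; // every node starts alone in its own set
    }
  }

  // Returns the representative of x, compressing the path on the way back
  int find(int x) {
    if (parent[x] != x) {
      parent[x] = find(parent[x]);
    }
    return parent[x];
  }

  // Merges the sets of x and y, returns false if they were already in the same set
  boolean union(int x, int y) {
    int rootX = find(x);
    int rootY = find(y);
    if (rootX == rootY) {
      return false;
    }
    // union by rank: attach the shallower tree under the deeper one
    if (rank[rootX] < rank[rootY]) {
      int tmp = rootX;
      rootX = rootY;
      rootY = tmp;
    }
    parent[rootY] = rootX;
    if (rank[rootX] == rank[rootY]) {
      rank[rootX]++;
    }
    componentCount--;
    return true;
  }

  boolean connected(int x, int y) {
    return find(x) == find(y);
  }

  int components() {
    return componentCount;
  }

  public static void main(String[] args) {
    UnionFindSketch uf = new UnionFindSketch(5);
    uf.union(0, 1);
    uf.union(1, 2);
    uf.union(3, 4);
    System.out.println(uf.connected(0, 2) + " " + uf.connected(0, 3) + " " + uf.components());
  }
}
```

With the component count at hand, the minimum number of edges to add to make the graph connected is simply components() - 1.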

You can check my implementation of unionFind on my Gist along with some tests in UnionFindJTests.

12/07/2024

[git] Add git options to GitLab pushes

Using git it is possible to specify options during push operations to perform specific actions, for example you can have a successful push autocreate the PR/MR and even set additional flags.

You can do this by passing push options with -o OPTION (or --push-option=OPTION) directly via CLI when pushing, for example git push -o merge_request.create, or by setting them globally in your .gitconfig file.

For example, when using GitLab, adding the following options to your .gitconfig:

[push]
    pushOption = merge_request.create
    pushOption = merge_request.target=main
    pushOption = merge_request.squash
    pushOption = merge_request.merge_when_pipeline_succeeds
    pushOption = merge_request.remove_source_branch
    pushOption = merge_request.title="TODO CHANGE ME"
    pushOption = merge_request.draft

after each push you will:

- automatically create a draft MR
- set target branch main
- set title "TODO CHANGE ME"
- assign it to you (and respect any MR template you have, e.g. for default reviewers)
- set the flags to squash commits on merge
- set the flag to delete source branch on merge
- enable the automerge when all configured checks (pipeline success, required approvals, etc) pass

This small automation will likely help you speed up your development and help you include the CM concepts in your CI/CD pipelines.