
Graph Problem List

Grid Graph Problem List

Graph

DFS

Find connected components, check for cycles

261. Graph Valid Tree

Undirected graph, check if it’s acyclic and connected.

Marking vis[i] = true is sufficient. However, because the graph is undirected, you must skip the parent node when traversing a node's neighbors, so the parent is passed as a DFS parameter.
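The approach above can be sketched as follows (a possible implementation; class and method names are my own): DFS from node 0, passing the parent so the edge we arrived by is not mistaken for a cycle, then check that every node was visited.

```java
import java.util.ArrayList;
import java.util.List;

public class ValidTree {
    // A graph is a valid tree iff DFS from node 0 finds no cycle
    // and visits every node (i.e. the graph is connected).
    static boolean validTree(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        boolean[] vis = new boolean[n];
        if (!dfs(adj, vis, 0, -1)) return false;       // cycle detected
        for (boolean v : vis) if (!v) return false;    // not connected
        return true;
    }

    // Pass the parent so the edge we just came from is not treated as a cycle.
    static boolean dfs(List<List<Integer>> adj, boolean[] vis, int u, int parent) {
        vis[u] = true;
        for (int v : adj.get(u)) {
            if (v == parent) continue;   // skip the parent node
            if (vis[v]) return false;    // revisiting a node => cycle
            if (!dfs(adj, vis, v, u)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(validTree(5, new int[][]{{0,1},{0,2},{0,3},{1,4}}));       // true
        System.out.println(validTree(5, new int[][]{{0,1},{1,2},{2,3},{1,3},{1,4}})); // false
    }
}
```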

Collection Interface

collection

List

ArrayList

Basic Concepts

  1. Allows null values
  2. Memory is contiguous, saving memory space.
  3. Insertion time complexity: inserting at the head is O(n) because every subsequent element must shift one position back; inserting at the tail is O(1) when no resize is needed, but a resize first costs O(n) to copy the backing array before the O(1) insert; inserting in the middle is O(n).
  4. Deletion time complexity: Corresponds to insertion time complexity.
  5. Not thread-safe.

Underlying Implementation

  1. Implemented by dynamic array, random access to any node only requires calculating its address value, so random access is O(1). Implements the RandomAccess marker interface.
  2. If no initial capacity is specified, the default initial capacity is 0. When adding the first element, capacity becomes 10. Each dynamic expansion increases capacity to 1.5 times the original.
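The 1.5x growth rule can be checked with a small sketch (class name is my own; the formula mirrors ArrayList's newCapacity = oldCapacity + oldCapacity/2):

```java
public class GrowthDemo {
    // Mirrors ArrayList's growth rule: new = old + old/2 (1.5x, rounded down)
    static int grow(int oldCapacity) {
        return oldCapacity + (oldCapacity >> 1);
    }

    public static void main(String[] args) {
        int cap = 10; // capacity after the first add with a default-constructed ArrayList
        for (int i = 0; i < 4; i++) {
            System.out.print(cap + " ");
            cap = grow(cap);
        }
        // prints: 10 15 22 33
    }
}
```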

Difference Between ArrayList and Array

  1. ArrayList can dynamically expand; Array cannot, its length is fixed and must be specified during array initialization
  2. ArrayList supports generics
  3. ArrayList only supports objects; for primitive data types, their wrapper classes like Integer, Double must be used. Array supports both primitive data types and objects.

LinkedList

Underlying Implementation

  1. Implemented with doubly linked list. Insertion/deletion at head/tail has O(1) time complexity
  2. Does not support random access. Time complexity for accessing a specific element is O(n)
  3. Performance is usually inferior to ArrayList's, so it is rarely used in practice.
  4. Not thread-safe.
  5. Allows null values
  6. Besides data, it also stores pointers, so it occupies more space than ArrayList.
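The O(1) head/tail operations versus O(n) indexed access can be seen directly (class name is my own):

```java
import java.util.LinkedList;

public class DequeDemo {
    static LinkedList<Integer> build() {
        LinkedList<Integer> list = new LinkedList<>();
        list.addFirst(1);   // O(1): just relink the head node
        list.addLast(2);    // O(1): just relink the tail node
        list.addFirst(0);
        return list;
    }

    public static void main(String[] args) {
        LinkedList<Integer> list = build();
        System.out.println(list);        // [0, 1, 2]
        System.out.println(list.get(1)); // 1 -- O(n): walks node by node from the nearer end
    }
}
```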

HashMap

Implementation Principle

  1. Implemented with hash table + linked list/red-black tree. Methods to resolve hash collisions include chaining/probing, generally using chaining.
  2. How is the hash value calculated? Take the key's hashCode(), then XOR it with itself shifted right (unsigned) by 16 bits:
int hash = (key == null) ? 0 : (key.hashCode()) ^ ((key.hashCode()) >>> 16);
  3. Calculate the array index: hash & (length - 1), which is equivalent to hash % length when length is a power of two. Bitwise operations are faster than modulo operations.
  4. If a collision occurs, search the linked list or red-black tree to see whether the key already exists. If it exists, replace the value directly; if not, add the entry. When a linked list's length reaches 8 and the table capacity is at least 64, the list is converted to a red-black tree (below capacity 64, the table is resized instead).
  5. When a red-black tree's node count drops to 6 or fewer, it is converted back to a linked list.
  6. Initial capacity is 16, load factor is 0.75, and each resize doubles the capacity.
  7. Not thread-safe. Concurrent resizing could create a cycle in a linked list (a known issue with JDK 7's head-insertion transfer).
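The hash-spreading and index calculation above can be combined into one small sketch (class name is my own):

```java
public class HashIndexDemo {
    // Mirrors the spreading step shown above: XOR the hashCode with its
    // upper 16 bits so the high bits also influence the bucket index.
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int capacity = 16;                            // default capacity, a power of two
        int index = hash("example") & (capacity - 1); // same as hash % capacity here
        System.out.println(index);                    // some bucket in [0, 15]
    }
}
```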

Time Complexity

Ideally, HashMap’s insertion, deletion, and search operations have an average time complexity of O(1).
If a bin has been converted to a red-black tree, operations on that bin stabilize at O(log n).

Process

A process uniquely owns:

  1. PC
  2. Stack and stack pointer (the stack stores return addresses and local data)
  3. Address space
  4. File descriptors

To manipulate a process, user code must invoke system calls to enter kernel mode. Always check syscall return values; -1 indicates an error.

Kernel Space

  1. Kernel space is at the top of the entire virtual address space and is shared by all user processes. User processes occupy the lower portion.
  2. As long as the system is running, the kernel remains active. It doesn’t start/stop like ordinary processes; it initializes at boot and runs until shutdown.

Process termination

A process can terminate via three paths:

References: technical blog; OSTEP slides

Virtual Memory

Virtualization: each process believes it has its own address space, and different processes' address spaces are isolated from each other. (Reference: Address Space)

In practice, when the CPU gets a virtual address, the MMU inside the CPU translates the virtual address into a physical address (an address in main memory). Each memory address refers to a single byte.
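The notes above don't go into the translation mechanism, but a common scheme the MMU uses is paging: the virtual address is split into a page number (looked up in a page table) and an offset (copied through unchanged). A minimal sketch of the arithmetic, assuming 4 KiB pages (class name and constants are my own):

```java
public class VaddrDemo {
    static final int PAGE_SIZE = 4096;   // assumed 4 KiB pages

    static int pageNumber(int vaddr) { return vaddr / PAGE_SIZE; }
    static int pageOffset(int vaddr) { return vaddr % PAGE_SIZE; }

    public static void main(String[] args) {
        int vaddr = 0x12345;
        // Page number indexes the page table; the offset selects the byte
        // within the page, so each address refers to a single byte.
        System.out.println(pageNumber(vaddr) + " " + pageOffset(vaddr)); // 18 837
    }
}
```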

Reference video (Chinese)

Causes of OutOfMemoryError

  1. Allocating too many objects at once → use pagination/streaming
  2. Memory leak: reachable but no-longer-needed objects still retained → find and release them
  3. Insufficient resources (the heap is genuinely too small) → use jmap to inspect the heap

jmap is a JDK CLI tool to generate a heap dump. A heap dump is a binary snapshot of all objects on the heap. Or add -XX:+HeapDumpOnOutOfMemoryError so the JVM dumps the heap automatically on OOM.
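Cause 2 (a memory leak) typically looks like the sketch below (class and names are my own): a static collection outlives every request, so its entries remain reachable and the GC can never reclaim them, even though the program no longer needs them.

```java
import java.util.ArrayList;
import java.util.List;

public class LeakDemo {
    // A static collection outlives every request; entries added here
    // stay reachable forever, so the GC can never reclaim them.
    static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest() {
        CACHE.add(new byte[1024]);   // added but never removed -> leak
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) handleRequest();
        System.out.println(CACHE.size() + " buffers still reachable"); // 1000
    }
}
```

In a heap dump, such a leak shows up as one collection retaining an ever-growing number of objects.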

Reference: Xiaolin Coding

Transaction Properties

Atomicity: Implemented via undo log.

Consistency: Achieved through the other three properties. Before and after a transaction, the database transitions from one consistent state to another.

Isolation: Implemented via MVCC/locking. The execution of one transaction does not affect the execution of others.

Durability: Implemented via redo log. Once a transaction is committed, its changes to the database are permanent, even in the event of a system failure or crash.

Clustered Index, Secondary Index

Clustered Index

In InnoDB, each table has a clustered index that determines the physical order of data storage on disk.

  1. If there is a primary key, it serves as the clustered index
  2. If there is no primary key, choose a unique non-null index as the clustered index
  3. If there still isn’t one, InnoDB implicitly creates an auto-increment column as the clustered index

B+ Tree

By default, InnoDB uses B+ Tree storage for both clustered and secondary indexes.

Reference: JavaGuide

ThreadLocal

The Thread class has a variable threadLocals, which can be considered a HashMap used to store thread-private content.

/* ThreadLocal values pertaining to this thread. This map is maintained
 * by the ThreadLocal class. */
ThreadLocal.ThreadLocalMap threadLocals = null;

The ThreadLocal class has an instance method set, which actually stores the value in the current thread's threadLocals map, with the ThreadLocal object itself as the key.
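The per-thread isolation this gives can be demonstrated with a short sketch (class name is my own): each thread writes into its own threadLocals map, so increments in worker threads never affect the main thread's copy.

```java
public class ThreadLocalDemo {
    static final ThreadLocal<Integer> COUNTER = ThreadLocal.withInitial(() -> 0);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            COUNTER.set(COUNTER.get() + 1);  // writes into this thread's own map
            System.out.println(Thread.currentThread().getName() + " -> " + COUNTER.get());
        };
        Thread t1 = new Thread(task, "t1");
        Thread t2 = new Thread(task, "t2");
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Each worker saw its own counter reach 1; the main thread's copy
        // was never touched, so it still holds the initial value:
        System.out.println("main -> " + COUNTER.get()); // main -> 0
    }
}
```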