I document tech-related List(blog, talk, slide, book)s that I find interesting and useful, and I use this repo as a cloud bookmark.
Functional programming and programming in general, code architecture, engineering, machine (deep) learning, computer vision.
- About Scala and Python. ☕
- I will update when I find something interesting. 😉
- The tragic tale of the deadlocking Python queue A story of deadlocks and despair.
- Production-ready Docker packaging for Python developers
- A deep dive into the official Docker image for Python
- Testing and debugging Apache Airflow Use Pytest to test DAG integrity and unit test Airflow components locally without the need for a production system.
- Kaggle intro for API
- Packaging a python library Good practices for packaging a library.
- Functional Data Engineering — a modern paradigm for batch data processing Use functional data engineering for data pipelines in Airflow
- Google Python Style Guide Similar to PEP8
- GitHub Actions — Makes ‘One Click To Deploy’ Feasible For ML CI/CD Pipeline Create a CI/CD Pipeline for Machine Learning Applications with GitHub Actions
- Monitoring Machine Learning Models in Production How to monitor ML models effectively, drawing on system design and DevOps for model performance, logs, etc.
- Model evaluation, model selection, and algorithm selection in machine learning Cross-validation and hyperparameter tuning
- FP to the Min by John De Goes Functional Scala in production without category theory, higher-kinded types (HKT), etc.
- Writing a high quality data pipeline for master data with apache spark Write ETL with Apache Spark.
- Do not log Logging is a side effect.
- Beginner’s Guide To Abstraction Abstraction (not type theory) and complexity as factors in code.
- Structural Pattern Matching Adding pattern matching to Python to create more expressive functions.
- It's probably time to stop recommending Clean Code Negative reviews of `Clean Code`.
- Write Like A Programmer An essay about writing the way you program, rather than the way you would write a novel.
- What Functional Programming Is, What it Isn't, and Why it Matters FP: local reasoning and composition make for better code.
- Lambda Architecture: Design Simpler, Resilient, Maintainable and Scalable Big Data Solutions A walkthrough of the Lambda architecture.
- Functional Code is Honest Code Programming without side effects gives an honest signature.
- Most tech content is bullshit Don't consume. Create. Ask questions. Stay curious.
- What a typical 100% Serverless Architecture looks like in AWS! Best-practices serverless architecture for a web application.
- IMPORT SUGGESTIONS IN SCALA 3 Improvement for `implicit`.
- Airflow: Tips, Tricks, and Pitfalls Use a factory method to run a SubDAG.
- Data’s Inferno: 7 Circles of Data Testing Hell with Airflow CI for Airflow pipeline, from DAG integrity tests to DTAP.
- The definitive guide on how to use static, class or abstract methods in Python Object reference and (un)binding for methods
- DEVELOPERS DON'T NEED PING-PONG TABLES On the contrary, they need autonomy, mastery, and purpose. Well said!
- Can Apache Kafka Replace a Database? Evaluates Kafka against database features to see whether Kafka is a good fit as a database.
- setup.py tricks Package management.
- Design Patterns and Python Flyweight Pattern Compare Flyweight methods
- Spark and Kafka integration patterns Integrate Spark Streaming and Kafka, Kafka producer for publishing results of the Spark Streaming processing.
- Spark and Kafka Integration Patterns, Part 2 Sending Spark Streaming processing results to Kafka
- Test strategies for data processing pipelines
- Functional Programming with Kafka Streams and Scala Use Scala and Kafka Streams for data processing, with producing and consuming methods and the Circe JSON library.
- What is the Lambda Architecture? Demonstrate how to use Lambda Architecture to design batch layer, stream layer, serving layer.
- The Three Components of a Big Data Data Pipeline Data Engineering = `Compute + Storage + Messaging` + Coding + Architecture + Domain Knowledge + Use Cases
- Oh, All the things you'll traverse Traversable `@typeclass` of monoids and functors.
- FP for the average Joe - I - ScalaZ Validation Generic Validation to compute both Success and Failure.
- Capturing data pipeline errors functionally with Writer Monads
- Functional Programming and Category Theory [Part 2] – Applicative Functors
- Improve Your Python: Metaclasses and Dynamic Classes With Type Everything has type, a type is a metaclass.
- Write Better Python Functions Signs of good function and how to refactor.
- How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh Build distributed architectures at scale by using data mesh.
- How to keep your docker installation clean? How to save disk space and safely remove docker images and volumes.
- Using Kafka as a message queue Explains how to read messages from clusters using the `at least once` delivery policy and how to handle errors.
- The path forward for typing - Python Language Summit 2020 Typing in Python.
- Effect Tracking Is Commercially Worthless To FP, effect tracking is commercially worthless.
- Python + Memcached: Efficient Caching in Distributed Applications Use Memcached cache to boost application performance.
- Polymorphic Programming Polymorphic languages use abstraction to solve problems in an identical way.
- Turning Scala code into Spark Use Scala for a declarative approach to Spark data processing.
- Implementing Pattern Matching in Python Uses Scala's pattern matching concept to implement an AST in Python.
- Covariance and contravariance in subtyping Introduces the Liskov substitution principle for subtyping and creates an immutableList type to implement covariance in Python. One rule to follow: `be liberal in what you accept and conservative in what you produce`.
- Some notes on the Y combinator Demonstrates the Y combinator concept using lambda functions in Python.
- Data Processing and Enrichment in Spark Streaming with Python and Kafka Batch and windowed streaming processing by using Spark and Kafka.
- Data architectures for streaming applications The architectures for stream (unbounded) processing, Scala, and cloud computing.
- Functional Programming For The Rest of Us A good one about the roots of FP, from the philosophy of maths to the laws of the universe and lambda calculus.
- To Iterate is Human, to Recurse, Divine Talks about the beauty of code structure in terms of control (time) and data (structure), and the idea of higher-order functions as `chunks of logic`.
- Spark Architecture: Shuffle Hash, Sort shuffle and off-heap storage buffer explained.
- Why We Do Scala in Zalando Type system and function composition, referential transparency (RT), Monad transformer in Scala.
- typelevel-alchemist Typeclass and Cats library explained.
- Designing a Horizontally Scalable Event-Driven Big Data Architecture with Apache Spark Data ingestion with Kafka Connect and Spark Stream and toolsets for data processing and analytics.
- Monad Transformers for the working programmer Monads don't compose, so use the Cats library to transform them with OptionT, EitherT and other constructors.
- Spark on Scala: Adobe Analytics Reference Architecture Spark Scala to build scalable data processing pipeline.
- Typechecking SQL queries with doobie Combine doobie (a principled JDBC layer) and Monix (asynchronous programming in Scala) to obtain fast, concise and typechecked JDBC SQL database access.
- Introduction to Category Theory in Scala Categories and functors in Scala.
- Who implements the typeclass instance? Ad-hoc polymorphism with `GADT` pattern matching.
- Distributed Systems Architecture Spark architecture: how memory is used within the total heap size, and how the YARN resource manager works.
- Getting Func-ey Part 3 - Typeclasses It talks about using the pattern-style `implicit` to implement a `typeclass`, and the `record of functions` view from Haskell to think of a typeclass as class + method, plus MonadError.
- Compiling to lambda-calculus: Turtles all the way down Using Church encoding to formulate function currying, destructuring match etc.
- Building an analytical data lake with Apache Spark and Apache Hudi - Part 1 Use Scala Spark to build a data lake.
- Programming With Effects Main generic applicative methods on effects.
- Typed Tagless Final Interpreters Generic functional programming using the `final tagless` pattern, by Oleg Kiselyov.
- Objects, Identity, and Concept-formation An eye-opening article about how there is no such thing as Object-Oriented programming.
- Types and Functions Types are about composability; use the type system to design a type family (type parameters) to improve program performance.
- What makes a function pure? A pure function has the features of totality, determinism, and purity.
- A Beginner-Friendly Tour through Functional Programming in Scala Pure FP!
- Covariance and contravariance in Scala Concept of covariance and contravariance in Scala, type parameter for subtyping.
- A Complete Guide to Variance in Java and Scala
- Monads are Elephants Part 1 Explains the Monad concept using unit and flatMap.
- Thoughts On Working With Nested Monad Within The Future Monad In Scala Use sugared `for comprehension`, encode the failure in Future, or use Monad Transformers with Cats EitherT for the monad `Future[Either[F, S]]`.
- Strategic Scala Style: Principle of Least Power A style guide for writing clean and DRY Scala code using the Principle of Least Power philosophy: refactor code, don't over-engineer, and avoid complexity.
- What's Functional Programming All About? "FP" in the software world.
- Mature Developers This article talks about the qualities that a mature developer should build.
- The most critical Python code metric It suggests that for dynamic code, writing tests is actually the best way to ensure your architecture is well designed.
- Tagless with Discipline — Testing Scala Code the Right Way The discipline of writing tests, and how to use the Discipline library to write Scala tests.
- Type Erasure in Scala Scala is a type-safe language, but code that compiles can still fail at run time because of type erasure. One solution is to use the Scala reflection API to inspect the types of an instance at run time; another is to use polymorphic subtypes.
- The Evolution of a Scala Programmer Good examples of Scala patterns.
- A little bit of Data Science in Scala How to use Scala for a DS project; in particular, the Scala collection library is easy and fast for data processing and similar to the Spark API.
- Abstract Factory Factory pattern for Scala.
- Design Patterns in Scala
- Advanced multi-stage build patterns Introduces patterns for using multiple FROM instructions and the --from flag in a Dockerfile to construct multiple stages.
- Three cool things you can do with Scalameta How to use the Scalameta library.
- Python project maturity checklist Written by Michał Karzyński, this article gives a typical checklist for turning a Python script into a fully fledged open-source project. It is a step-by-step guide to using setuptools, Black, pre-commit etc.
- The soul of the beast, Everything about Python's grammar Hacking the grammar, the talk can be found on this link.
- Modularity for Maintenance Never send a human to do a machine’s job.
- On Eliminating Error in Distributed Software Systems It reviews some common techniques used to eliminate errors in software systems: testing, the type system, functional programming, and formal verification.
- Production-ready Docker packaging Docker packaging guide for Python.
- My Decade in Review The whole story of Dan Abramov, a self-taught programmer and one of the best in the web dev world. Bookmarked because I was inspired by his open source spirit.
- Refactoring and asking for forgiveness How to use the programming pattern EAFP (easier to ask for forgiveness than permission) instead of LBYL (look before you leap).
- From Academia to Data Science Fill the gap between academic CS and industrial ML/DL.
- Putting the Power of Kafka into the Hands of Data Scientists Prototyping and building a data integration Highway, a self-service layer designed on top of Kafka that uses Kafka Connect for integration.
- How to set up a perfect Python project Introduces good practices for setting up and designing a Python project: Pytest for testing, Black for formatting, isort for import sorting, mypy for static typing, flake8 for linting, and pre-commit for Git hooks that run scripts automatically on push/commit.
- Yes silver bullet A delightful article. It reframes essential and accidental complexity in software development and quotes William Gibson, "The future is already here — it's just not very evenly distributed.", which aptly explains changes such as automated testing and statically typed functional programming.
- Modern Data Practice and the SQL Tradition Reconsiders taking advantage of advanced RDBMS features (such as triggers and stored procedures) and proposes to "use RDBMS in the first place", comparing performance with NoSQL, the Python library Pandas and the distributed system ElasticSearch.
- Cache me if you can – 1 Covers HTTP caching: an intro to why caching is needed (CDN and reverse proxy), how it works in theory (Cache-Control, ETag, If-Modified-Since, etc.), and an analysis of the anatomy of a modern web application, concluding that caching is the solution to the latency bottleneck.
- Static Analysis at Scale: An Instagram Story A Python use case for introducing linting and type checking, implemented with Pyre in production.
- Scalability for Dummies - Part 3: Cache Covers the in-memory cache Redis and two cache patterns: cached database queries and cached objects.
- Julie Pagano: It's Dangerous to Go Alone: Battling the Invisible Monsters in Tech A PyCon talk about the issues around, and tools for, impostor syndrome in the programming community.
- Introduction to Event Streaming with Kafka and Kafdrop Introduces event-driven architecture by breaking it down into the main components of a pub/sub system: broker, ZooKeeper, producer and consumer. Illustrates the relations among topic, message, partition and offset, covering load balancing and the delivery guarantee model. With that understanding in place, it introduces the Kafka web UI tool Kafdrop.
- Tales from running Kafka Streams in Production The key components in Kafka processing layer.
- A Glossary of Functional Programming A collection of simple (but reasonably precise) pedagogical definitions for a range of functional concepts.
- Practical Change Data Streaming Use Cases with Apache Kafka & Debezium Speaker Gunnar Morling (RedHat Open Source Data Engineer) talks about how to leverage CDC for reliable microservices integration, e.g. using the outbox pattern, as well as many other CDC applications (maintaining audit logs, driving streaming queries).
- Streams and Monk – How Yelp is Approaching Kafka in 2020 Why and how Yelp redesigned their streaming infrastructure to scale clusters with a single configuration push, load balance and decommission brokers, automatically trigger rolling-restarts to pick up new cluster configuration, and more.
- High Performance Kafka Producers Explains metrics for producer configurations, including compression.type, batch.size, buffer.memory, etc., to gain high performance for message flow through the Producer.
- A beginner’s guide to CDC (Change Data Capture) A simple introduction to CDC, from the classic database trigger (event-condition-action logic) to Debezium, a new open source project stewarded by RedHat. It mentions how to use Debezium connectors to extract CDC events from a DBMS and propagate them to Apache Kafka in an OLTP use case.
- DBLog: A Generic Change-Data-Capture Framework DBLog is a Java-based framework able to capture changes in real time and to take dumps. The article illustrates the log/dump processing, DBMS support and other core capabilities.
- Streams and Tables in Apache Kafka: Elasticity, Fault Tolerance, and Other Advanced Concepts Confluent Kafka blog, quality is high.
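
As a small illustration of the "Refactoring and asking for forgiveness" entry above (easier to ask for forgiveness than permission, versus look before you leap), here is a minimal Python sketch. It is not taken from the linked article; the `config` dict, the `timeout` key and both function names are hypothetical, chosen only to show the two styles side by side.

```python
# Hypothetical example: `config`, the "timeout" key and both function
# names are made up for illustration only.

def timeout_lbyl(config, default=30.0):
    """Look before you leap: check the precondition, then act."""
    if "timeout" in config:
        # Still raises ValueError if the value is not numeric,
        # because only the key's presence was checked.
        return float(config["timeout"])
    return default


def timeout_eafp(config, default=30.0):
    """Ask forgiveness: just try, and handle the failure cases."""
    try:
        return float(config["timeout"])
    except (KeyError, ValueError, TypeError):
        # Missing key, non-numeric string, or a value such as None.
        return default
```

In this sketch the try/except version handles a missing key and a malformed value in one place, while the check-first version would need an extra check for each failure mode.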