AI-related (3)
AI-SCHOLAR,
Unsorted
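- Editing technique that makes faces in video "look younger" or "always smile," developed by an Israeli team: Innovative Tech - ITmedia NEWS
- Patrolling with four drones and one ground-vehicle robot in tow: a system for exploration by mixed human-robot teams: Innovative Tech - ITmedia NEWS
- VR technique that smooths avatar-to-avatar contact such as "patting the head," "slapping," and "refusing a hug," developed by Tokyo Tech: Innovative Tech - ITmedia NEWS
- Transmitting human movement to robots and other people!? A look at the demo of Docomo's "Human Augmentation Platform" (Impress Watch) - Yahoo! News
- Relocated from Kyoto! "麺鍾馗" opens, serving rich Kyoto-style ramen in a new location with many competitors! - 吉川雅子 | Yahoo! JAPAN Creators Program
- System that converts objects in photos into 3D models, developed by a research team from US-based Snap and others (ITmedia NEWS) - Yahoo! News
- Analyzing throat movement with MRI to convert silent mouthing into text, an aid to computer use for people with speech impairments and others (ITmedia NEWS) - Yahoo! News
- System that converts objects in photos into 3D models, developed by a research team from US-based Snap and others: Innovative Tech - ITmedia NEWS
- Analyzing throat movement with MRI to convert silent mouthing into text, an aid to computer use for people with speech impairments and others: Innovative Tech - ITmedia NEWS
- University of Tokyo experiment converts a quiet voice into a "pseudo-loud voice"; system developed that lets users experience shouting: Innovative Tech - ITmedia NEWS
- Calf-worn device that lets you experience any gravity you like, simulating the gravitational environments of other planets: Innovative Tech - ITmedia NEWS
- AI automatically generates 3D models from text alone, developed by a research team from US-based Google and others: Innovative Tech - ITmedia NEWS
- OpenAI's new text-to-image model "GLIDE" generates higher-quality images than the previous model (ITmedia NEWS) - Yahoo! News
- Headphones that let a VR character blow on your ear, developed by the University of Tokyo; reproduces warm and cool airflow without an air source (ITmedia NEWS) - Yahoo! News
- Helmet that authenticates individuals by head shape, developed by Ritsumeikan University and others; for uses such as identifying workers at construction sites (ITmedia NEWS) - Yahoo! News
- the four GAFA: The World Remade by the Four Horsemen | Scott Galloway | Books | Online Shopping | Amazon
- "JoJoGAN" converts face photos into the style of JoJo characters; the AI learns from a single image (ITmedia NEWS) - Yahoo! News
- Getting MicroPython Running on the Pico for a Start – 楽しくやろう。
- Deep Learning Highlights of 2021 (Research Papers Edition) - Qiita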
- [2112.15320] InverseMV: Composing Piano Scores with a Convolutional Video-Music Transformer
- ml-system-design-pattern | System design patterns for machine learning
- [2201.04453] Obstacle avoidance for blind people using a 3D camera and a haptic feedback sleeve
- Mercury Share Python Notebooks with others | MLJAR
- GitHub - mljar/mercury: Mercury: easily convert Python notebook to web app and share with others
- [2201.03545] A ConvNet for the 2020s
- [2202.02606] ROMNet: Renovate the Old Memories
- [2202.03026] Context Autoencoder for Self-Supervised Representation Learning
- [2202.02435] On Neural Differential Equations
- [2202.02831] Anticorrelated Noise Injection for Improved Generalization
- GitHub - gordicaleksa/pytorch-original-transformer: My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently included IWSLT pretrained models.
- Computer Vision: Algorithms and Applications, 2nd ed.
- [2202.01855] Self-supervised Learning with Random-projection Quantizer for Speech Recognition
- OpenAI API - Documentation
- [2202.03376] Message Passing Neural PDE Solvers
- [2202.03314] A Robot Web for Distributed Many-Device Localisation
- accelerate/cv_example.py at main · huggingface/accelerate · GitHub
- [2112.10510] Transformers Can Do Bayesian Inference
- examples/FastStyleTransferPytorch.ipynb at master · FraPochetti/examples · GitHub
- [2107.13530] An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning
- [2202.01374] mSLAM: Massively multilingual joint pre-training for speech and text
- [2202.02474] Importance Weighting Approach in Kernel Bayes' Rule
- [2202.03038] Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry
- Release ESPnet Version 0.10.6 · espnet/espnet · GitHub
- [2106.13695] CADDA: Class-wise Automatic Differentiable Data Augmentation for EEG Signals
- [2202.02365] Marius++: Large-Scale Training of Graph Neural Networks on a Single Machine
- Achieving FaceSwap with SberSwap Without a Per-Subject Training Process | cedro-blog
- Mathematics for Machine Learning | Companion webpage to the book “Mathematics for Machine Learning”. Copyright 2020 by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong. Published by Cambridge University Press.
- [2111.07725] Investigating self-supervised front ends for speech spoofing countermeasures
- GitHub - jgkwak95/AU-GAN: Adverse Weather Image Translation with Asymmetric and Uncertainty aware GAN in BMVC2021
- GitHub - vinvino02/GLPDepth: GLPDepth PyTorch Implementation: Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth
- kaz / AI Academy CEO on Twitter: "[Four picks of Python and data science 100-knock exercise sets] ① NLP 100 Knocks https://t.co/cdnBtGcYF9 ② Image Processing 100 Knocks https://t.co/1wZZaT0BsB ③ Data Science 100 Knocks https://t.co/iTKHVljMUe ④ NumPy 100 Knocks https://t.co/kV5SYXDQvB #github https://t.co/MfjYJwn1fw" / Twitter
- [2202.00901] Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing
- [2202.00565] Data-driven emergence of convolutional structure in neural networks
- GitHub - clementchadebec/benchmark_VAE: Library for Variational Autoencoder benchmarking
- GitHub - walzimmer/3d-bat: 3D Bounding Box Annotation Tool (3D-BAT) Point cloud and Image Labeling
- Release 0.21.0 · arduino/arduino-cli · GitHub
- GitHub - tsattler/visloc_pseudo_gt_limitations
- [2202.00512] Progressive Distillation for Fast Sampling of Diffusion Models
- Practical Quantization in PyTorch | PyTorch
- [2202.01841] Transport Score Climbing: Variational Inference Using Forward KL and Adaptive Neural Transport
- [2201.12765] Improving Corruption and Adversarial Robustness by Enhancing Weak Subnets
- GitHub - Kai-46/SatelliteSfM: A library for solving the satellite structure from motion problem
- Image Similarity Challenge 2021 recap | Wide baseline stereo meets deep learning
- [P] Deep Learning for time series forecasting (neuralforecast, python package) : MachineLearning
- GitHub - skelemoa/ntu-x: NTU-X, which is an extended version of popular NTU dataset
- [2202.00164] DexVIP: Learning Dexterous Grasping with Human Hand Pose Priors from Video
- [2202.00528] Examining Scaling and Transfer of Language Model Architectures for Machine Translation
- GitHub - salesforce/BLIP: PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
- [2201.13433] Third Time's the Charm? Image and Video Editing with StyleGAN3
- Third Time's the Charm?
- GitHub - ott-jax/ott
- [2201.12324] Optimal Transport Tools (OTT): A JAX Toolbox for all things Wasserstein
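- From 39 Billion to 82 Billion Parameters | Behind the Development of HyperCLOVA, LINE's Giant Language Model | AI News Media AINOW
- Automatically creating realistic virtual 3D scenes from Google Street View, developed by Google and the University of Toronto: Innovative Tech - ITmedia NEWS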
- Infographic: Sentiment Scale Reveals Which Words Pack the Most Punch
- Char or Greta Quiz | Quiz Maker - a quiz platform service for playing by answering and by creating
- Aligning Language Models to Follow Instructions
- [2202.01602] The Disagreement Problem in Explainable Machine Learning: A Practitioner's Perspective
- [2108.09022] Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images
- Classification of 74 facial emoji’s emotional states on the valence-arousal axes | Scientific Reports
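- Technology: Facial Emoji Convey Human Emotions Well | Scientific Reports | Nature Portfolio
- High-performance sensor attached to the brain with 100 times the resolution of conventional ones; wireless implantation possible in the future: Innovative Tech - ITmedia NEWS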
- [2201.11114] Natural Language Descriptions of Deep Visual Features
- GitHub - facebookresearch/omnivore: Omnivore: A Single Model for Many Visual Modalities
- [2201.10801] When Shift Operation Meets Vision Transformer: An Extremely Simple Alternative to Attention Mechanism
- [2105.07520] Dynamic Pooling Improves Nanopore Base Calling Accuracy
- [2201.10103] Improving non-autoregressive end-to-end speech recognition with pre-trained acoustic and language models
- [2202.04040] Self-Conditioned Generative Adversarial Networks for Image Editing
- [2202.03125] Building Synthetic Speaker Profiles in Text-to-Speech Systems
- Introducing Text and Code Embeddings in the OpenAI API
- [1809.01496] Learning Gender-Neutral Word Embeddings
- [2201.10375] Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention
- Audio samples from "Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention"
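- Joint announcement: Successful creation of a Down syndrome model rat; expected to help elucidate the mechanisms of brain pathology in Down syndrome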
- GitHub - sberbank-ai/sber-swap
- GitHub - PeterL1n/RobustVideoMatting: Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
- GitHub - chenfengxu714/SqueezeSegV3
- The shape of the output layer is different from the result of the tensorflow2onnx transformation. · Issue #25 · PINTO0309/tflite2tensorflow · GitHub
- PINTO_model_zoo/240_BSRGAN at main · PINTO0309/PINTO_model_zoo · GitHub
- [2010.07061] GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music
- Selecting molecules with diverse structures and properties by maximizing submodular functions of descriptors learned with graph neural networks | Scientific Reports
- GitHub - martinsbruveris/tensorflow-image-models: TensorFlow port of PyTorch Image Models (timm) - image models with pretrained weights.
- GitHub - Vandermode/ERRNet: Single Image Reflection Removal Exploiting Misaligned Training Data and Network Enhancements (CVPR 2019)
- GitHub - zdlarr/Location-aware-SIRR: Code for the paper "Location-aware Single Image Reflection Removal"
- GitHub - svip-lab/HRNet-for-Fashion-Landmark-Estimation.PyTorch: Fashion Landmark Estimation with HRNet
- How to Generate a Presentation Video from a Script and Slides with the Google Colaboratory Version of Zundavox - YouTube
- zundavox v0.1a public release - Colaboratory
- CVSS Dataset | Papers With Code
- PINTO_model_zoo/229_DexiNed at main · PINTO0309/PINTO_model_zoo · GitHub
- I Made a Joyman Generator - Qiita
- Vector-Quantized Variational Autoencoders
- Causal Discovery for Linear Mixed Data | OpenReview
- Closing In on the Mystery of the Beta Distribution
- GitHub - iwatake2222/InferenceHelper: Helper Class for Deep Learning Inference Frameworks: TensorFlow Lite, TensorRT, OpenCV, OpenVINO, ncnn, MNN, SNPE, Arm NN, NNAbla
- [2202.03751] InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training
- [2202.04200] MaskGIT: Masked Generative Image Transformer
- Blurry Faces - a Hugging Face Space by frapochetti
- GitHub - pytorch/functorch: functorch is a prototype of JAX-like composable function transforms for PyTorch.
- GitHub - google/jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
- Building an Image Classification App with Gradio and Timm - YouTube
- [2202.04538] Generating Training Data with Language Models: Towards Zero-Shot Language Understanding
- GitHub - j-min/DallEval: DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers
- Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
- GitHub - NVlabs/instant-ngp: Instant neural graphics primitives: lightning fast NeRF and more
- Foundations of Deep Learning (Feizi)-Programming-Modules.ipynb - Colaboratory
- Google AI Blog: Nested Hierarchical Transformer: Towards Accurate, Data-Efficient, and Interpretable Visual Understanding
- [2202.03957] Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning
- GitHub - stepjam/BPP: Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning
- Michael Bronstein | Neural diffusion PDEs, differential geometry, and graph neural networks - YouTube
- [2201.03787] All-optical ultrafast ReLU function for energy-efficient nanophotonic deep learning
- Python autocompletion with a Transformer XL model - YouTube
- GitHub - labmlai/python_autocomplete: Use Transformers and LSTMs to learn Python source code
- PINTO_model_zoo/228_Fast-SCNN at main · PINTO0309/PINTO_model_zoo · GitHub
- [2202.04947] OWL (Observe, Watch, Listen): Localizing Actions in Egocentric Video via Audiovisual Temporal Context
- [2202.05083] Cross-speaker style transfer for text-to-speech using data augmentation
- [2202.04713] PINs: Progressive Implicit Networks for Multi-Scale Neural Representations
- [2202.05009] NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN
- [2201.07207] Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents
- GitHub - tensorflow/tensorflow at r2.8
- Which is faster in a blink, closing or opening? Interview with Shishikawa, creator of the super-cute palm-sized robot "Stack-chan" (contributed by 深水英一郎) - Yahoo! JAPAN
- PINTO_model_zoo/224_Y-net at main · PINTO0309/PINTO_model_zoo · GitHub
- GitHub - JingyunLiang/VRT: VRT: A Video Restoration Transformer (official repository)
- [2109.10252] Audiomer: A Convolutional Transformer For Keyword Spotting
- [2202.05008] EvoJAX: Hardware-Accelerated Neuroevolution
- GitHub - google/evojax
- [2202.05262] Locating and Editing Factual Knowledge in GPT
- GitHub - spotify/pedalboard: 🎛 🔊 A Python library for adding effects to audio.
- [2111.10003] Differentiable Wavetable Synthesis
- [2110.04621] Universal Paralinguistic Speech Representations Using Self-Supervised Conformers
- [2110.07313] Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks
- [2201.02279] De-rendering 3D Objects in the Wild
- [2201.13425] Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning
- [2201.02609] Generalized Category Discovery
- GitHub - PINTO0309/openvino2tensorflow: This script converts the ONNX/OpenVINO IR model to Tensorflow's saved_model, tflite, h5, tfjs, tftrt(TensorRT), CoreML, EdgeTPU, ONNX and pb. PyTorch (NCHW) -> ONNX (NCHW) -> OpenVINO (NCHW) -> openvino2tensorflow -> Tensorflow/Keras (NHWC/NCHW) -> TFLite (NHWC/NCHW). And the conversion from .pb to saved_model and from saved_model to .pb and from .pb to .tflite and saved_model to .tflite and saved_model to onnx. Support for building environments with Docker. It is possible to directly access the host PC GUI and the camera to verify the operation. NVIDIA GPU (dGPU) support. Intel iHD GPU (iGPU) support.
- GitHub - sberbank-ai/ru-dolph: RuDOLPH: One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP
- GitHub - pallets/flask: The Python micro framework for building web applications.
- GitHub - lucidrains/nuwa-pytorch: Implementation of NÜWA, state of the art attention network for text to video synthesis, in Pytorch
- Make a Habit of Finding and Reading Papers | Dr. Kano | note
- GitHub - facebookresearch/mae: PyTorch implementation of MAE https//arxiv.org/abs/2111.06377
- [2111.06377] Masked Autoencoders Are Scalable Vision Learners
- GitHub - venture-anime/cartoongan-pytorch: Experimental CartoonGAN (Chen et.al.) implementation for quicker background generation for posters and new episodes
- GitHub - openvinotoolkit/openvino: OpenVINO™ Toolkit repository
- [2201.01763] Robust Self-Supervised Audio-Visual Speech Recognition
- toolbox/torch-distributed-gpu-test.py at master · stas00/toolbox · GitHub
- [2110.09485] Learning in High Dimension Always Amounts to Extrapolation
- GitHub - microsoft/HuRL: Code repository accompanying the Heuristic Guided RL NeurIPS'21 paper
- [2106.02757] Heuristic-Guided Reinforcement Learning
- [2201.00424] Splicing ViT Features for Semantic Appearance Transfer
- GitHub - omerbt/Splice: Official Pytorch Implementation for "Splicing ViT Features for Semantic Appearance Transfer" presenting "Splice"
- Front-End Design Patterns
- [2112.02418] YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
- GitHub - coqui-ai/TTS: 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
- Shinnosuke Takamichi (高道 慎之介) - corpus
- GitHub - zengqunzhao/EfficientFace: “Robust Lightweight Facial Expression Recognition Network with Label Distribution Training”, AAAI 2021.
- GitHub - andrewlstewart/StereoNet_PyTorch: StereoNet PyTorch Lightning
- [2112.12938] Counterfactual Memorization in Neural Language Models
- Performance and Scalability: How To Fit a Bigger Model and Train It Faster
- Sharing custom models
- NLP+CSS 201 Tutorials | Tutorials for advanced natural language processing methods designed for computational social science research.
- One-track minds: Using AI for music source separation
- GitHub - hoffstadt/DearPyGui: Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
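- Use of the NLP Model BERT at Yahoo - Yahoo! JAPAN Tech Blog
- Canon develops the world's first SPAD sensor and moves toward mass production… an innovation for image-processing technology worldwide
- Introducing Recipes for Implementing Anomaly Detection with Deep Learning - Qiita
- A second-year engineer at PFN takes on "tool building" for machine learning. What matters is imagination [PFN] | Liiga, career support for young professionals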
- GitHub - wyf0912/LLFlow: The code release of paper "AAAI Low-Light Image Enhancement with Normalizing Flow", AAAI 2022
- Easily Building GUI Applications with Python and guizero - あっきぃ日誌
- PaddleSeg/contrib/CityscapesSOTA at release/2.3 · PaddlePaddle/PaddleSeg · GitHub
- OpenCV's Plugin Feature - Qiita
- Be Careful with Rotation in Object Detection! Think of Rotation in Terms of Ellipses - Qiita
- GitHub - Kazuhito00/FaceDetection-Anti-Spoof-Demo: Webcam demo of spoofing detection (anti-spoof-mn3)
- [2112.00054] Task2Sim : Towards Effective Pre-training and Transfer from Synthetic Data
- GitHub - ppogg/YOLOv5-Lite: 🍅🍅🍅YOLOv5-Lite: lighter, faster and easier to deploy. Evolved from yolov5 and the size of model is only 930+kb (int8) and 1.7M (fp16). It can reach 10+ FPS on the Raspberry Pi 4B when the input size is 320×320~
- 高橋 かずひと | LAPRAS Profile
- NVIDIA Research's GauGAN AI Art Demo Responds to Words | NVIDIA Blog
- PINTO_model_zoo/033_Hand_Detection_and_Tracking at main · PINTO0309/PINTO_model_zoo · GitHub
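- Small Talk Among Geniuses | [Special Feature] [The Future of Humanity × Tech] New Currents in Technology: Seize the Chance for Transformation [PART3 The Power Created by Communication] | Wedge Online Premium | note
- Contents of the Lectures at the Imperial New Year's Lectures (Kosho Hajime), Reiwa 4 (2022) - Imperial Household Agency
- Practical Data Science Series | Book Information | Kodansha Scientific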
- [2201.04182] HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
- [2007.05891] HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections
- [2110.14416] Transformers Generalize DeepSets and Can be Extended to Graphs and Hypergraphs
- A Book for Solidly Learning "Algorithms × Mathematics" for Problem Solving from the Basics | 米田 優峻 | Books | Online Shopping | Amazon
- GitHub - dair-ai/ML-YouTube-Courses: A repository to index and organize the latest machine learning courses found on YouTube.
- Papers with Code Newsletter #20 | Papers With Code
- GitHub - rougier/scientific-visualization-book: An open access book on scientific visualization using python and matplotlib
- Isomorphic Labs | Blog
- GitHub - bayesoptbook/bayesoptbook.github.io: Companion webpage for the book "Bayesian Optimization" by Roman Garnett
- Opening up a physics simulator for robotics | DeepMind
- The-Art-of-Linear-Algebra/The-Art-of-Linear-Algebra.pdf at main · kenjihiranabe/The-Art-of-Linear-Algebra · GitHub
- MIT Tech Review: DeepMind achieves results in weather forecasting, accurately predicting the time and location of rainfall
- 100+ Most Valuable Github Repositories For Machine Learning
- The Feynman Lectures on Physics Audio Collection
- GitHub - cs-books/influential-cs-books: Most influential books on Computer Science/programming
- [2108.07258] On the Opportunities and Risks of Foundation Models
- Overview — Deep Learning for Molecules and Materials
- github/README.md at main · kaityo256/github · GitHub
- [2108.05542] AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
- [2108.04526] A Survey on Deep Reinforcement Learning for Data Processing and Analytics
- [2009.05673] Applications of Deep Neural Networks
- Deep Learning for AI | July 2021 | Communications of the ACM
- [2108.02497] How to avoid machine learning pitfalls: a guide for academic researchers
- A Cephalopod Has Passed a Cognitive Test Designed For Human Children
- Introduction to AWS, Learned Through Code
- Increasing number of attempts ver. 2021 - Speaker Deck
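- We Are Releasing the Materials and Videos from mixi's Technical Training for 2021 New Graduates! - mixi developers
- Deep Learning Textbook: Deep Learning G Test (Generalist) Official Text, 2nd Edition | 猪狩 宇司, 今井 翔太, 江間 有沙, 岡田 陽介, 工藤 郁子, 巣籠 悠輔, 瀬谷 啓介, 徳田 有美子, 中澤 敏明, 藤本 敬介, 松井 孝之, 松尾 豊, 松嶋 達也, 山下 隆義, Japan Deep Learning Association | Books | Online Shopping | Amazon
- "A structure unsolved for six years, solved with ease": the impact of "AlphaFold2," which predicts the "shape" of proteins; released on GitHub and available to anyone - ITmedia NEWS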
- [2007.01547] Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers
- GitHub - probml/pyprobml: Python code for "Machine learning: a probabilistic perspective" (2nd edition)
- [2106.08962] Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
- Basic Linear Algebra Course
- [2106.04554] A Survey of Transformers
- [2106.03253] Tabular Data: Deep Learning is Not All You Need
- Inter-University Consortium | Mathematics and Informatics Center, The University of Tokyo
- [2106.00123] Deep Reinforcement Learning in Quantitative Algorithmic Trading: A Review
- [2106.02253] X-volution: On the unification of convolution and self-attention
- [2105.14103] An Attention Free Transformer
- University of Tokyo Graduate School Intensive Lecture: Computational Neuroscience - YouTube
- GitHub - dvgodoy/dl-visuals: Over 200 figures and diagrams of the most popular deep learning architectures and layers FREE TO USE in your blog posts, slides, presentations, or papers.
- [2105.08050] Pay Attention to MLPs
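- President's Address at the AY2021 (Reiwa 3) University of Tokyo Graduate School Matriculation Ceremony | The University of Tokyo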
- [2103.03404] Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
- [2102.10772v1] Transformer is All You Need: Multimodal Multitask Learning with a Unified Transformer
- [2105.03322] Are Pre-trained Convolutions Better than Pre-trained Transformers?
- The Missing Semester of Your CS Education · the missing semester of your cs education
- GitHub - PKSHATechnology-Research/tdmelodic: A Japanese accent dictionary generator
- Machine Learning Datasets | Papers With Code
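- [DL Reading Group] MuZero: Mastering Atari, Go, chess and shogi by planning with a…
- An Introduction to Machine Learning You Can Understand (KS Physics Technical Books) | 富谷 昭夫 | Books | Online Shopping | Amazon
- Introduction to Deep Reinforcement Learning | Vincent Francois-Lavet, Peter Henderson, Riashat Islam, Marc G.Bellemare, Joelle Pineau, 松原 崇充, 松原 崇充, 井尻 善久, 濵屋 政志 | Books | Online Shopping | Amazon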
- [2102.10772] UniT: Multimodal Multitask Learning with a Unified Transformer
- Synthesia - Wikipedia
- "Synthesia," which produces professional-grade videos with AI, has raised a cumulative 7.6 billion yen | Forbes JAPAN
- [2111.14690] DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion
- [2110.06864] ByteTrack: Multi-Object Tracking by Associating Every Detection Box
- History of Multiple-object tracking (MOT) Algorithm Research: Introducing SOTA Models After DeepSORT - ACES Engineer Blog
- GitHub - kuboshizuma/UVTextureConverter: To convert atlas texture (defined in Densepose) to normal texture (defined in SMPL), and vice versa.
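- AI vs. Humans! I Tried Challenging the Limits of Face Recognition Technology - karaage. [からあげ]
- Technology developed that estimates a person's height from unclear footage and can identify the same person: Artificial Intelligence News - MONOist
- Run It on Core ML! Lightweight, Fast On-Device Speech Recognition Using a CNN - Yahoo! JAPAN Tech Blog
- How We Built a Face Recognition Model That Can Authenticate Even When a Mask Is Worn - ACES Engineer Blog
- Natural-Language AI Services and Copyright Infringement | STORIA Law Office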
- GitHub - rwightman/pytorch-image-models: PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
- GitHub - NVlabs/imaginaire: NVIDIA's Deep Imagination Team's PyTorch Library
- Norod78/VintageStyle at main
- [2107.03065] Msdtron: a high-capability multi-speaker speech synthesis system for diverse data using characteristic information
- [2202.05416] FAAG: Fast Adversarial Audio Generation through Interactive Attack Optimisation
- [2202.05826] End-to-end Algorithm Synthesis with Recurrent Networks: Logical Extrapolation Without Overthinking
- [2202.05718] Audio Defect Detection in Music with Deep Networks
- [2202.05508] Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer
- machine-learning-notes/sklearn-categorical-numerical-mix.ipynb at main · rasbt/machine-learning-notes · GitHub
- [P] C++ Machine Learning Library Built From Scratch by a 16-Year-Old High Schooler : MachineLearning
- On Overlapping with Prior Work
- GitHub - moskomule/jax_devcontainer: devcontainer for JAX
- NeurIPS Data-Centric AI Workshop - Speaker Deck
- I Made daaja, a Data Augmentation Library for Japanese NLP - 農園
- Data Distribution Shifts and Monitoring
- GitHub - ibaiGorordo/ONNX-PackNet-SfM: Python scripts for performing monocular depth estimation using the PackNet-SfM model in ONNX
- GitHub - autonomousvision/projected_gan: [NeurIPS'21] Projected GANs Converge Faster
- On the Reduced Susceptibility of People with ASD to Visual Illusions | Masakazu Ide | note
- Co-authored Paper Accepted at "WSDM," a Top Conference on Web Data Mining - Yahoo! JAPAN Tech Blog
- GitHub - google/ml_collections: ML Collections is a library of Python Collections designed for ML use cases.
- [2202.05607] Online Decision Transformer
- Tensorflowz-Js · GitHub
- GitHub - Tensorflowz-Js/pose-animator
- [2111.09883] Swin Transformer V2: Scaling Up Capacity and Resolution
- [2111.09886] SimMIM: A Simple Framework for Masked Image Modeling
- [2111.09888] Simple but Effective: CLIP Embeddings for Embodied AI
- [2111.09887] PyTorchVideo: A Deep Learning Library for Video Understanding
- [2111.09881] Restormer: Efficient Transformer for High-Resolution Image Restoration
- [2111.09876] One-Shot Generative Domain Adaptation
- [2111.09858] Successor Feature Landmarks for Long-Horizon Goal-Conditioned Reinforcement Learning
- [2111.09847] Edge-preserving Domain Adaptation for semantic segmentation of Medical Images
- [2111.09833] TransMix: Attend to Mix for Vision Transformers
- [2111.09808] Exploring the Limits of Epistemic Uncertainty Quantification in Low-Shot Settings
- [2111.09799] LiDAR Cluster First and Camera Inference Later: A New Perspective Towards Autonomous Driving
- [2111.09797] Boosting Supervised Learning Performance with Co-training
- [2111.09793] Unsupervised Online Learning for Robotic Interestingness with Visual Memory
- [2111.09779] Wiggling Weights to Improve the Robustness of Classifiers
- [2111.09748] The Way to my Heart is through Contrastive Learning: Remote Photoplethysmography from Unlabelled Video
- [2111.09740] Interactive segmentation using U-Net with weight map and dynamic user interactions
- [2111.09734] ClipCap: CLIP Prefix for Image Captioning
- [2111.09733] Perceiving and Modeling Density is All You Need for Image Dehazing
- [2111.09708] A Trainable Spectral-Spatial Sparse Coding Model for Hyperspectral Image Restoration
- [2111.09692] SUB-Depth: Self-distillation and Uncertainty Boosting Self-supervised Monocular Depth Estimation
- [2111.09642] Towards Intelligibility-Oriented Audio-Visual Speech Enhancement
- [2111.09639] Recurrent Variational Network: A Deep Learning Inverse Problem Solver applied to the task of Accelerated MRI Reconstruction
- [2111.09641] Evaluating Transformers for Lightweight Action Recognition
- [2111.09635] Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy
- [2111.09621] SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking
- [2111.09613] Improving Transferability of Representations via Augmentation-Aware Self-Supervision
- [2111.09571] Robust Person Re-identification with Multi-Modal Joint Defence
- [2111.09560] Adaptive Shrink-Mask for Text Detection
- [2111.09539] Deep neural networks-based denoising models for CT imaging and their efficacy
- [2111.09526] Learning Modified Indicator Functions for Surface Reconstruction
- [2111.09515] RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation
- [2111.09492] Reference-based Magnetic Resonance Image Reconstruction Using Texture Transformer
- [2111.09485] 3D Lip Event Detection via Interframe Motion Divergence at Multiple Temporal Resolutions
- [2111.09463] Self-Attending Task Generative Adversarial Network for Realistic Satellite Image Creation
- [2111.09451] Efficient deep learning models for land cover image classification
- Switch Science Has Begun Partnering with Crowd Supply. – Switch Science Magazine
- [2110.04374] A Few More Examples May Be Worth Billions of Parameters
- The world is a game. The Road from Games to Autonomous Driving and on to Artificial General Intelligence | 山本一成🚗 of autonomous-driving company TURING | note
- Reward is enough - ScienceDirect
- GitHub - patrick-kidger/diffrax: Numerical differential equation solvers in JAX. Autodifferentiable and GPU-capable.
- Diffrax
- Preview the Markdown rendering of gists | GitHub Changelog
- GitHub - NVIDIA/MatX: An efficient C++17 GPU numerical computing library with Python-like syntax
- [2111.08276] Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts
- [2111.08575] GRI: General Reinforced Imitation and its Application to Vision-Based Autonomous Driving
- JAX vs. PyTorch: I Tested Which Is Faster - まったり勉強ノート
- [2111.06474] AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization
- GitHub - deepmind/einshape
- GitHub - arogozhnikov/einops: Deep learning operations reinvented (for pytorch, tensorflow, jax and others)
- [2111.05992] On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning
- [2111.06394] The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos
- [2111.05754] Prune Once for All: Sparse Pre-Trained Language Models
- GitHub - yu-li/AGLLNet: Attention Guided Low-light Image Enhancement with a Large Scale Low-light Simulation Dataset, IJCV 2021.
- [1406.2661] Generative Adversarial Networks
- [1710.10196] Progressive Growing of GANs for Improved Quality, Stability, and Variation
- [1812.04948] A Style-Based Generator Architecture for Generative Adversarial Networks
- [2202.05924] Compute Trends Across Three Eras of Machine Learning
- GitHub - dair-ai/GNNs-Recipe: A recipe to study Graph Neural Networks (GNNs)
- [2202.06709] How Do Vision Transformers Work?
- [2105.08199] Randomly Initialized Convolutional Neural Network for the Recognition of COVID-19 using X-ray Images
- [2202.06438] Learning from Randomly Initialized Neural Network Features
- Artbreeder. Draw me an Electric Sheep. | by Vlad Alex (Merzmensch) | Towards Data Science
- [1910.09524] Cascaded Generation of High-quality Color Visible Face Images from Thermal Captures
- GitHub - meteorshowers/X-StereoLab: SOS IROS 2018 GOOGLE; StereoNet ECCV2018 GOOGLE; ActiveStereoNet ECCV2018 Oral GOOGLE; HITNET CVPR2021 GOOGLE;PLUME Uber ATG
- [2111.06849] Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data
- [2202.06539] Deduplicating Training Data Mitigates Privacy Risks in Language Models
- [2009.04930] Orientation Keypoints for 6D Human Pose Estimation
- Sloan Research Fellowships awarded to Computer Science’s David Duvenaud and Alec Jacobson | Faculty of Arts & Science
- MOT Challenge - Data
- [2202.06417] A Contrastive Framework for Neural Text Generation
- [1812.05944] A Tutorial on Distance Metric Learning: Mathematical Foundations, Algorithms, Experimental Analysis, Prospects and Challenges (with Appendices on Mathematical Background and Detailed Algorithms Explanation)
- ailia-models/pose_estimation/movenet at master · axinc-ai/ailia-models · GitHub
- GitHub - open-mmlab/mmpose: OpenMMLab Pose Estimation Toolbox and Benchmark.
- [1908.05806v2] Cross-Domain Adaptation for Animal Pose Estimation
- A Rough Look at Some Information on In-Browser Pose Estimation (MediaPipe JavaScript version, TensorFlow.js, BlazePose, MoveNet) - Qiita
- MoveNet: An ultra-fast and accurate pose detection model. | TensorFlow Hub
- MoveNet: A Deep Neural Network for Joint Profile Prediction Across Variable Walking Speeds and Slopes | IEEE Journals & Magazine | IEEE Xplore
- Inside MoveNet, Google’s Latest Pose Detection Model
- [2110.07641] Non-deep Networks
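- GitHub - Kazuhito00/MOT-Tracking-by-Detection-Pipeline: A framework for Tracking-by-Detection style MOT (Multi Object Tracking) that separates the Detection and Tracking steps and gathers implementations of each
- Measuring the Performance of a Differentiable Smith-Waterman Algorithm in JAX - まったり勉強ノート
- [Press Release] Optical neural network circuit modeled on the cerebellum: realizing an ultrafast, low-power optical reservoir computing chip | 日本の研究.com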
- YOLOv4 vs YOLOv4-tiny. Training custom YOLO detectors for Mask… | by Techzizou | Analytics Vidhya | Medium
- GitHub - linghu8812/tensorrt_inference
- GitHub - aqeelanwar/MaskTheFace: Convert face dataset to masked dataset
- play_with_depthai/pj_depthai_depth_by_tensorrt at master · iwatake2222/play_with_depthai · GitHub
- [2108.06084] Curriculum Learning: A Regularization Method for Efficient and Stable Billion-Scale GPT Model Pre-Training
- [2110.07858] Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation
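- Cheap, Fast, Tasty: 20 Optimization Techniques for BigQuery - Qiita
- Hotto Link to release the book "Introduction to Text Analytics with Python," edited and co-written by R&D Department Head 榊, on March 10 | Hotto Link Inc.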
- [2202.07203] Collision-free Path Planning in the Latent Space through cGANs
- [2010.15581] The De-democratization of AI: Deep Learning and the Compute Divide in Artificial Intelligence Research
- [2109.10686] Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
- [2202.07646] Quantifying Memorization Across Neural Language Models
- [2202.06991] Transformer Memory as a Differentiable Search Index
- International Conference on Machine Learning - Wikipedia
- Conference on Neural Information Processing Systems - Wikipedia
- International Conference on Learning Representations - Wikipedia
- [2202.07305] ViNTER: Image Narrative Generation with Emotion-Arc-Aware Transformer
- GitHub - NVIDIA/apex: A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
- [2201.04182] HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning
- HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning (w/ Author) - YouTube
- neohookean.pdf
- NVIDIA’s Stretchy Simulation: Super Quick! 🐘 - YouTube
- Wang-2021-GBS.pdf
- Is Simulating Tiny Cloth Wrinkles Possible? 👕 - YouTube
- GitHub - swghosh/DeepFace: Keras implementation of the renowned publication "DeepFace: Closing the Gap to Human-Level Performance in Face Verification" by Taigman et al. Pre-trained weights on VGGFace2 dataset.
- GitHub - PaddlePaddle/PaddleHub: Awesome pre-trained models toolkit based on PaddlePaddle.(300+ models including Image, Text, Audio and Video with Easy Inference & Serving deployment)
- [2003.07694] Parameter-Free Style Projection for Arbitrary Style Transfer
- [2103.16130] Active Learning for Deep Object Detection via Probabilistic Modeling
- [2106.10199] BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
- [2202.07728] Don't Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis
- GitHub - RobertTLange/evosax: Evolution Strategies in JAX 🦎
- 00_getting_started.ipynb - Colaboratory
- [2202.07765] General-purpose, long-context autoregressive modeling with Perceiver AR
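- A 3-step introduction to causal inference - Speaker Deck
- Turning the implementation of a machine learning inference Web API into a reusable template
- Introducing the book 『機械学習を解釈する技術』 (Techniques for Interpreting Machine Learning) / Devsumi2022 - Speaker Deck
- LINE has three papers accepted at ICASSP, the world's largest international conference on speech and acoustic signal processing | News | LINE Corporation
- Public comment procedure opened for the "Camera Image Utilization Guidebook ver3.0 (draft)" (METI/Ministry of Economy, Trade and Industry)
- A shock to the speech synthesis industry! VOICEPEAK, text-to-speech software that now sounds like a human speaking, can be used freely even for business purposes
- Briefly trying out the new speech synthesis software VOICEPEAK - YouTube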
- [2202.08137] A data-driven approach for learning to control computers
- [2010.09895] Multi-Window Data Augmentation Approach for Speech Emotion Recognition
- [2202.08587] Gradients without Backpropagation
- [2012.07805] Extracting Training Data from Large Language Models
- [2202.08143] Bias in Automated Image Colorization: Metrics and Error Types
- [2202.07968] On loss functions and evaluation metrics for music source separation
- [2202.07816] ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech
- [2108.13320] Neural HMMs are all you need (for high-quality attention-free TTS)
- [2110.03869] Self-supervised Speaker Recognition with Loss-gated Learning
- [2110.06841] On Language Model Integration for RNN Transducer based Speech Recognition
- [2111.01007] Projected GANs Converge Faster
- Papers with Code Newsletter #25 | Papers With Code
- Questions for Flat-Minima Optimization of Modern Neural Networks
- [2111.01236] Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation
- [2202.07785] Predictability and Surprise in Large Generative Models
- [2111.01253] Neural Scene Flow Prior
- University of Tokyo succeeds in synthesizing the "twin brother of diamond", which had existed only in theory | TECH+
- GitHub - fakufaku/fast_bss_eval: A fast implementation of bss_eval metrics for blind source separation
- [2110.05249] A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation
- [2202.08474] Non-Autoregressive ASR with Self-Conditioned Folded Encoders
- [2110.06440] SDR -- Medium Rare with Fast Computations
- A brief timeline of NLP from Bag of Words to the Transformer family | by Fabio Chiusano | NLPlanet | Feb, 2022 | Medium
- DeepETA: How Uber Predicts Arrival Times Using Deep Learning
- GitHub - unifyai/ivy: The Unified Machine Learning Framework
- [2202.07206] Impact of Pretraining Term Frequencies on Few-Shot Reasoning
- GitHub - dair-ai/GNNs-Recipe: A recipe to study Graph Neural Networks (GNNs)
- GitHub - dair-ai/Transformers-Recipe: A quick recipe to learn all about Transformers
- Sharpened Cosine Similarity
- Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
- AI Overcomes Stumbling Block on Brain-Inspired Hardware | Quanta Magazine
- BentoML: Create an ML Powered Prediction Service in Minutes | by Khuyen Tran | Feb, 2022 | Towards Data Science
- Day 1 — Day 60 : Quick Recap of 60 days of Data Science and ML | by Naina Chaturvedi | Coders Mojo | Jan, 2022 | Medium
- Datasets — datasets 1.18.3 documentation
- 🏷️ Label your data to fine-tune a classifier with Hugging Face — Rubrix master documentation
- 3D volumetric rendering with NeRF
- Physics - Nanoscale Computer Operates at the Speed of Light
- [2202.08360] Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
- GitHub - open-mmlab/mmrotate: OpenMMLab Rotated Object Detection Benchmark
- Explore More About Excel: Analysis Toolpak | by Brant W | Feb, 2022 | Towards Data Science
- Gym | The Reinforcement Learning API
- [2202.08458] Wearable SELD dataset: Dataset for sound event localization and detection using wearable devices around head
- Introduction to distance sensors (stereo cameras, projection, LiDAR) - arutema47's blog
- On the principles and products of dToF/iToF LiDAR sensors - arutema47's blog
- Introduction to point cloud DNNs and 3D DNNs - 3DYOLO, VoxelNet, PointNet, FrustrumPointNet, Pointpillars - arutema47's blog
- [2110.04385] Individualized Hear-through For Acoustic Transparency Using PCA-Based Sound Pressure Estimation At The Eardrum
- Music demixing with the sliCQ Transform [Sevag Hanssian, McGill University] - YouTube
- [2202.01094] RescoreBERT: Discriminative Speech Recognition Rescoring with BERT
- [2111.05592] Improving the Chamberlin Digital State Variable Filter
- Three of the Most Underrated Data Science Concepts | by Tyler Buffington, PhD | Feb, 2022 | Towards Data Science
- [1811.12231] ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
- [2105.07197] Are Convolutional Neural Networks or Transformers more like human vision?
- [Paper introduction] A history of progress in numerical computation for Bayesian statistics (Part 1: from 1763 to the 20th century) - Qiita
- [2103.02898] Fast Tucker Rank Reduction for Non-Negative Tensors Using Mean-Field Approximation
- Math notation on note (trial operation → official operation) | 結城浩
- GitHub - deepmind/distrax
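- Fundamentals of statistics (50 videos in total) - YouTube
- A Ruby script for easily registering words for voice input on Mac and iPhone (generates vCard data and imports it into Contacts)
- Applying MIXUP at the final layer seemed to work well. About this article | by Akihiro FUJII | Medium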
- GitHub - rivernetio/nvidia-smi-exporter: Nvidia SMI metrics exporter for Prometheus
- Intel Will Keep Selling RealSense Stereo Cameras - IEEE Spectrum
- [2108.07258] On the Opportunities and Risks of Foundation Models
- Explaining, with a translation, Tesla's "AI team director explains Full Self-Driving" - EVsmart Blog
- github/README.md at main · kaityo256/github · GitHub
- Replacing PyTorch's argsort with sort to export to ONNX
- [2108.02765] Decoupled Transformer for Scalable Inference in Open-domain Question Answering
- GitHub - d2l-ai/d2l-en: Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 300 universities from 55 countries including Stanford, MIT, Harvard, and Cambridge.
- 100+ Free Data Science Books - Data Science PDF for Beginners and Experts
- GitHub - daac-tools/vaporetto: 🛥 Vaporetto: a fast and lightweight pointwise prediction based tokenizer
- Tensor decomposition with TensorLy | sn42's Blog
- Tensor decomposition - Wikipedia
- [2202.10435] Survey on Large Scale Neural Network Training
- inference_for_yolov5.ipynb - Colaboratory
- [2202.10261] A Self-Supervised Descriptor for Image Copy Detection
- DeepMind's AI for Mathematics Breakthrough Explained - YouTube
- [2202.11097] Message passing all the way up
- [1804.06559] SFace: An Efficient Network for Face Detection in Large Scale Variations
- [1311.2540] Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding
- [2106.01970] NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination
- [2108.01285] Toward Spatially Unbiased Generative Models
- Increasing number of attempts ver. 2021 - Speaker Deck
- [2107.12375] Geometric Deep Learning on Molecular Representations
- YOLOv4: the practicality of a fast, accurate object detector that can be trained and run for inference on a single GPU
- [2104.03308] Warp Consistency for Unsupervised Learning of Dense Correspondences
- Apple explains the AI technology in iOS 15 "Photos" that can recognize people even when their faces are not visible - Engadget Japan
- amortized-optimization-tutorial/maxent-animation.py at main · facebookresearch/amortized-optimization-tutorial · GitHub
- [2202.10613] Gaussian Processes and Statistical Decision-making in Non-Euclidean Spaces
- Yann LeCun on a vision to make AI systems learn and reason like animals and humans
- Google AI Blog: 4D-Net: Learning Multi-Modal Alignment for 3D and Image Inputs in Time
- [2202.08938] Improving Intrinsic Exploration with Language Abstractions
- [2202.10890] Hierarchical Perceiver
- Introducing TorchRec, a library for modern production recommendation systems | PyTorch
- Probing Image-Language Transformers for Verb Understanding | DeepMind
- Let us never speak of these values again. – arg min blog
- [2202.10608v1] It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation
- Yann LeCun: AI Doesn’t Need Our Supervision - IEEE Spectrum
- How to make a Discrete Universe
- How AI and IoT can help business leaders in the march to a more sustainable world - The AI Journal
- Multicollinearity and the reliability of regression coefficients. Also ridge regression. - Dropout
- [2106.13281] Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation
- The Four Innovation Phases of Netflix’s Trillions Scale Real-time Data Infrastructure | by Zhenzhong Xu | Feb, 2022 | Medium
- Introduction to BoTorch, a Bayesian optimization tool - Qiita
- Interpreting instance importance with influence functions - Dropout
- Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies
- [2107.05775] Fast and Explicit Neural View Synthesis
- GitHub - weihaox/awesome-neural-rendering: A collection of resources on neural rendering.
- When and how convolutional neural networks generalize to out-of-distribution category–viewpoint combinations | Nature Machine Intelligence
- [2007.08032] When and how CNNs generalize to out-of-distribution category-viewpoint combinations
- Income distributions and the generalized beta distribution family - Qiita
- Classi's trial and error with machine learning - Speaker Deck
- Getting Started with PyTorch Image Models (timm): A Practitioner’s Guide | by Chris Hughes | Feb, 2022 | Towards Data Science
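- Interpretability for language learners using an example-based grammatical error correction model - 金子の進捗
- DL4US content release page – Matsuo Laboratory, The University of Tokyo – Matsuo Lab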
- Build a data mesh on Google Cloud with Dataplex | Google Cloud Blog
- [2202.11136] FlowSense: Monitoring Airflow in Building Ventilation Systems Using Audio Sensing
- Spherical harmonics - Wikipedia
- Statistical Rethinking 2022 Lecture 01 - YouTube
- [2109.07623] BacHMMachine: An Interpretable and Scalable Model for Algorithmic Harmonization for Four-part Baroque Chorales
- A Meta prototype lets you build virtual worlds by describing them - The Verge
- [2202.11134] ProtoSound: A Personalized and Scalable Sound Recognition System for Deaf and Hard-of-Hearing Users
- [2202.10936] A Survey of Vision-Language Pre-Trained Models
- [2202.11214] FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators
- [2202.11169] Neural Speech Synthesis on a Shoestring: Improving the Efficiency of LPCNet
- [2202.11192] Modal Estimation on a Warped Frequency Axis for Linear System Modeling
- I am leaving Preferred Networks | 凡人のブログ
- Top 10 Sources to Find Computer Vision and AI Models
- [2109.11115] Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning
- [2202.01614] The RoyalFlush System of Speech Recognition for M2MeT Challenge
- [2202.12187] SonOpt: Sonifying Bi-objective Population-Based Optimization Algorithms
- Yann LeCun: "A Path Towards Autonomous AI", Baidu 2022-02-22 - YouTube
- [2202.12142] Pretraining without Wordpieces: Learning Over a Vocabulary of Millions of Words
- [2202.11712] Flow-based sampling in the lattice Schwinger model at criticality
- GPU Technology for CG/AI_No.9 - GDEP Solutions, Inc.
- pytti 5 beta.ipynb - Colaboratory
- [2202.11918] Phase Continuity: Learning Derivatives of Phase Spectrum for Speech Enhancement
- [2202.11929] Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring
- [2202.12163] Attentive Temporal Pooling for Conformer-based Streaming Language Identification in Long-form Speech
- [2202.12169] Closing the Gap between Single-User and Multi-User VoiceFilter-Lite
- [2202.11833] Near Perfect GAN Inversion
- GitHub - weihaox/awesome-gan-inversion: A collection of resources on GAN inversion.
- [2101.05278] GAN Inversion: A Survey
- Generative adversarial networks (GAN)
- CS 230 - Convolutional Neural Networks cheatsheet
- CS 230 - Recurrent Neural Networks cheatsheet
- CS 230 - Deep Learning Tips and Tricks cheatsheet
- GitHub - grananqvist/Awesome-Quant-Machine-Learning-Trading: Quant/Algorithm trading resources with an emphasis on Machine Learning
- [2202.11823] Differentially Private Speaker Anonymization
- Python Data Science Handbook | Python Data Science Handbook
- GitHub - keras-team/keras: Deep Learning for humans
- Asset2Vec: Turning 3D Objects into Vectors and Back | by Jonathan Laserson, PhD | Towards Data Science
- Neural Instrument Cloning from very few samples
- GitHub - iaddis/metalnes: Transistor level NES simulation
- GitHub - SourMesen/VisualNes: Visual NES simulates the CPU & PPU of a NES at the transistor level.
- 258 - Semi-supervised learning with GANs - YouTube
- [2203.00555] DeepNet: Scaling Transformers to 1,000 Layers
- PFN 3D Scanner - Preferred Networks
- Advanced exploratory data analysis (EDA) with Python | by Michael Notter | EPFL Extension School | Feb, 2022 | Medium
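- Estimating a 3D model of a person from video with PyMAF | cedro-blog
- Trying out the 1.3-billion-parameter Japanese GPT-2 | 楽しみながら理解するAI・機械学習入門
- Not just Grad-CAM: a thorough explanation of CAM methods in image recognition - ABEJA Tech Blog
- [DEIM2022] Introducing Vaporetto, a fast word segmenter, and Daachorse, a pattern matching machine - Speaker Deck
- [TURING] Building an end-to-end machine learning model that drives laps around a closed course, and actually running it on a car [autonomous driving]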
- [2104.13621] MLDemon: Deployment Monitoring for Machine Learning Systems
- [2109.14545] A Comprehensive Survey and Performance Analysis of Activation Functions in Deep Learning
- GitHub - ibaiGorordo/depthai-android-jni-example: Android example to get the rgb and disparity images from the OAK-D device connected to a phone.
- GitHub - PINTO0309/mediapipe-bin: MediaPipe Python Wheel installer for RaspberryPi OS aarch64, Ubuntu aarch64, Debian aarch64 and Jetson Nano.
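- The antonym of 「赤の他人」 (a complete stranger, literally "red stranger") is 「白い恋人」 ("white lover"): a story about wanting to generate this automatically - Qiita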
- GitHub - yuyay/DEIM2022_XAI_tutorial
- GitHub - Akiya-Research-Institute/Monocular-Depth-Estimation-on-UE4: UE4 project using NNEngine and MiDaS, monocular depth estimation AI
Transformer
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Attention
MBLLEN: Low-light Image/Video Enhancement Using CNNs
BLNet: A Fast Deep Learning Framework for Low-Light Image Enhancement with Noise Removal and Color Restoration
DA-DRN: Degradation-Aware Deep Retinex Network for Low-Light Image Enhancement
Generative Query Network
Single-stage Keypoint-based Category-level Object Pose Estimation from an RGB Image
StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
Transfer Learning for Pose Estimation of Illustrated Characters
Block-NeRF: Scalable Large Scene Neural View Synthesis
Rank Minimization for Snapshot Compressive Imaging
Unsupervised Scale-consistent Depth Learning from Video
HINet: Half Instance Normalization Network for Image Restoration
Learning To Count Everything
SAFA: Structure Aware Face Animation
FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization
Learning to Detect Every Thing in an Open World
Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation
Objects as Points
Learning Privacy-preserving Optics for Human Pose Estimation
Patches Are All You Need?
CLIPasso: Semantically-Aware Object Sketching
MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching
FILM: Frame Interpolation for Large Motion
SSAST: Self-Supervised Audio Spectrogram Transformer
Boundary-Aware Segmentation Network for Mobile and Web Applications
End-to-end Lane Shape Prediction with Transformers
SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation
Deep High-Resolution Representation Learning for Visual Recognition
Suppress and Balance: A Simple Gated Network for Salient Object Detection
Deblurring Face Images using Uncertainty Guided Multi-Stream Semantic Networks
BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Autoregressive Diffusion Models
Vector Quantized Diffusion Model for Text-to-Image Synthesis
AnimeGAN
AniGAN: Style-Guided Generative Adversarial Networks for Unsupervised Anime Face Generation
SSUL: Semantic Segmentation with Unknown Label for Exemplar-based Class-Incremental Learning
FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation From a Single Image
WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose
Graph2Pix: A Graph-Based Image to Image Translation Framework
Generalization in Dexterous Manipulation via Geometry-Aware Multi-Task Learning
A System for General In-Hand Object Re-Orientation
MG-GAN: A Multi-Generator Model Preventing Out-of-Distribution Samples in Pedestrian Trajectory Prediction
The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation
MOTSynth: How Can Synthetic Data Help Pedestrian Detection and Tracking?
JoJoGAN: One Shot Face Stylization
textless-lib: a Library for Textless Spoken Language Processing
TF-GAN
Score-Based Generative Modeling with Critically-Damped Langevin Diffusion
Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection
scpi: Uncertainty Quantification for Synthetic Control Estimators
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
Learning Aberrance Repressed Correlation Filters for Real-Time UAV Tracking
VinVL: Revisiting Visual Representations in Vision-Language Models
YOLOX: Exceeding YOLO Series in 2021
It's Raw! Audio Generation with State-Space Models
Cyclical Focal Loss
Masked-attention Mask Transformer for Universal Image Segmentation
Human Pose Regression with Residual Log-likelihood Estimation
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses
How Do Vision Transformers Work?
RainGAN: Unsupervised Raindrop Removal via Decomposition and Composition
GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds
HITNet: Hierarchical Iterative Tile Refinement Network for Real-time Stereo Matching
TabNet: Attentive Interpretable Tabular Learning
Self-Supervised Transformers for Unsupervised Object Discovery using Normalized Cut
Neural Outlier Rejection for Self-Supervised Keypoint Learning
Self-Supervised 3D Mesh Reconstruction From Single Images
Unsupervised Learning of Action Classes With Continuous Temporal Embedding
UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
3D Reconstruction of Novel Object Shapes from Single Images
HumanGPS: Geodesic PreServing Feature for Dense Human Correspondences
BoxInst: High-Performance Instance Segmentation with Box Annotations
End-to-End Video Instance Segmentation with Transformers
Local Deep Implicit Functions for 3D Shape
Is Space-Time Attention All You Need for Video Understanding?
PatchmatchNet: Learned Multi-View Patchmatch Stereo
DataMix: Efficient Privacy-Preserving Edge-Cloud Inference
RepVGG: Making VGG-style ConvNets Great Again
A Morphable Model For The Synthesis Of 3D Faces
Do 2D GANs Know 3D Shape? Unsupervised 3D shape reconstruction from 2D Image GANs
NeRF++: Analyzing and Improving Neural Radiance Fields
Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction
pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
Geo-PIFu: Geometry and Pixel Aligned Implicit Functions for Single-view Human Reconstruction
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
CoReNet: Coherent 3D scene reconstruction from a single RGB image
Synthesize then Compare: Detecting Failures and Anomalies for Semantic Segmentation
Are Labels Necessary for Neural Architecture Search?