Monte Carlo Project (80670) = 2864

Published one year ago

Views: 126

Project code: 511664

Project description

#New_Project
#Project code: 80670
Subject: Monte Carlo project =
Hello, good day 2864
I am getting in touch about having a project done
: Hello again, I have about one month
: (Your paper should at least include c)

2.     Explain sample complexity for RL Wikipedia link

3.      Explain the Double Q-learning link

4.      Explain the Atari paper (not much math, but you will have to present it). Link

5.     Explain the Automatic Domain Randomization link(Only section 5)
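
For item 3, a minimal tabular sketch of the Double Q-learning update may help when presenting the idea: two value tables are kept, one chosen at random selects the greedy next action, and the other evaluates it. The function name `double_q_update`, the dictionary-based tables, and the default `alpha` and `gamma` values are illustrative assumptions, not taken from the linked paper.

```python
import random
from collections import defaultdict


def double_q_update(qa, qb, s, a, r, s_next, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """Apply one Double Q-learning step to exactly one of the two tables."""
    # Flip a fair coin to decide which table is updated on this step.
    if rng.random() < 0.5:
        update, evaluate = qa, qb
    else:
        update, evaluate = qb, qa

    if done:
        target = r
    else:
        # The updated table selects the greedy action in the next state...
        best = max(actions, key=lambda x: update[(s_next, x)])
        # ...and the other table supplies that action's value estimate.
        target = r + gamma * evaluate[(s_next, best)]

    update[(s, a)] += alpha * (target - update[(s, a)])


# Hypothetical usage: both tables start at zero for every (state, action).
qa, qb = defaultdict(float), defaultdict(float)
double_q_update(qa, qb, s=0, a=1, r=1.0, s_next=2, actions=(0, 1))
```

Separating action selection from action evaluation is what reduces the maximization bias of ordinary Q-learning; a behaviour policy would typically act greedily (or epsilon-greedily) on the sum of the two tables.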
: Basic rules:

1.      Report is expected to be 2-3 pages. This will include background, model/solution method studied, and the results. (The code must be appended separately)

2.     All original work should be cited, including papers, publicly available code used (website or GitHub)

3.     There will be a small interview where you will explain your work.

4.     You have the option to also do the final exam and take the better grade between the exam and the project.

5.     The grade will be assigned based on the clarity of your understanding and/or implementation. In particular, the difficulty of the problem does not contribute to the grade.

6.     In the case of a paper, a clearly formulated result and outline of its proof is expected (rather than a copy-paste from the original source!)

Suggestions:

Simple project ideas:

In what follows, example numbers refer to the book (SB):

Reinforcement Learning: An Introduction, Sutton and Barto, 2nd Edition. The book is available for free here, and references are to the final PDF version available here.

1.      Blackjack Example 5.3 page 99

2.     Windy Gridworld Example 6.4 page 130

3.     Shortest path in other simple situations

4.     Gradient Bandit Algorithm (SB 2.9) for non-stationary bandits. (Slightly harder)

5.     Use Gymnasium (a fork of OpenAI Gym) to design an agent that learns an optimal policy for one of its environments.
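
As an illustration of item 4, the following is a rough sketch of a gradient bandit agent on a non-stationary testbed. The random-walk drift of the arm means, the constant-step-size reward baseline, and all names and default parameters (`run_gradient_bandit`, `drift`, `baseline_step`) are assumptions made for this example; they are not prescribed by the assignment.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(h - m) for h in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_gradient_bandit(k=5, steps=2000, alpha=0.1, baseline_step=0.1,
                        drift=0.01, seed=0):
    rng = random.Random(seed)
    means = [rng.gauss(0.0, 1.0) for _ in range(k)]  # true arm values
    prefs = [0.0] * k                                # preferences H(a)
    baseline = 0.0                                   # running reward baseline
    optimal_picks = 0

    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(k), weights=probs)[0]
        reward = rng.gauss(means[action], 1.0)
        if action == max(range(k), key=means.__getitem__):
            optimal_picks += 1

        # Gradient bandit update: raise the chosen arm's preference when the
        # reward beats the baseline, and lower the others proportionally.
        for a in range(k):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += alpha * (reward - baseline) * (indicator - probs[a])

        # Assumption: a constant step size lets the baseline track drifting rewards.
        baseline += baseline_step * (reward - baseline)

        # Non-stationarity: every arm mean takes a small random-walk step.
        means = [m + rng.gauss(0.0, drift) for m in means]

    return prefs, softmax(prefs), optimal_picks / steps


prefs, probs, optimal_fraction = run_gradient_bandit()
print(f"fraction of optimal picks: {optimal_fraction:.2f}")
```

A report could compare several values of `alpha` and `drift` and plot the fraction of optimal picks over time.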

Somewhat more involved:

1.     Checkers, see SB 16.2. The difficulty here is constructing a suitable reward function (done historically by propagating a minimax tree search). The implementation of TD is fine. (You also need some effective dynamic programming, in the programming sense.)

2.     Ambitious would be to implement Backgammon

Some paper ideas:

1.     Stochastic Approximation Theory results with proofs:

a)     Robbins-Monro Algorithm

b)    Stochastic gradient method

c)     Q-learning converges to optimal.

(Your paper should at least inc
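
For paper idea 1(a), a small numerical sketch can accompany the written proof. It applies the Robbins-Monro iteration to a toy root-finding problem, estimating the mean of a noisy signal, with step sizes 1/n (whose sum diverges while the sum of their squares converges). The function name `robbins_monro`, the target value, and the Gaussian noise model are assumptions chosen for illustration.

```python
import random


def robbins_monro(noisy_g, theta0, n_steps, step=lambda n: 1.0 / n):
    """Iterate theta <- theta - a_n * Y_n, where Y_n is a noisy sample of g(theta)."""
    theta = theta0
    for n in range(1, n_steps + 1):
        theta -= step(n) * noisy_g(theta)
    return theta


# Toy problem (assumed): g(theta) = theta - E[X], whose root is the mean of X.
rng = random.Random(0)
target_mean = 3.0
estimate = robbins_monro(
    lambda theta: theta - rng.gauss(target_mean, 1.0),
    theta0=0.0,
    n_steps=20000,
)
print(f"estimate of the root: {estimate:.3f}")
```

With the 1/n schedule this particular iteration reduces to the running sample mean, which makes the link to the law of large numbers easy to explain in an interview.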
: We need to write one of these exercises the instructor has selected, preferably the simplest one, using the Monte Carlo method
: I think that is what he means
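
One possible reading of this request, assuming the simplest exercise is Blackjack (Example 5.3), is Monte Carlo control with exploring starts. The sketch below uses a simplified self-contained simulator (infinite deck, dealer sticks on 17 or more, naturals not treated specially); the simulator, the function names, and the episode count are assumptions for illustration, not the instructor's reference setup.

```python
import random
from collections import defaultdict

STICK, HIT = 0, 1


def draw(rng):
    # Infinite deck: ace counts as 1 here, face cards count as 10.
    return min(rng.randint(1, 13), 10)


def dealer_play(showing, rng):
    """Dealer hits until reaching 17 or more; returns the final total."""
    total, usable = (11, True) if showing == 1 else (showing, False)
    while total < 17:
        card = draw(rng)
        if card == 1 and total + 11 <= 21:
            total, usable = total + 11, True
        else:
            total += card
        if total > 21 and usable:
            total, usable = total - 10, False
    return total


def play_episode(policy, rng):
    """Exploring start: random state and first action, then follow the policy."""
    state = (rng.randint(12, 21), rng.randint(1, 10), rng.random() < 0.5)
    action = rng.choice((STICK, HIT))
    trajectory = []
    while True:
        trajectory.append((state, action))
        total, showing, usable = state
        if action == STICK:
            dealer = dealer_play(showing, rng)
            if dealer > 21 or total > dealer:
                return trajectory, 1
            return trajectory, (0 if total == dealer else -1)
        total += draw(rng)
        if total > 21 and usable:
            total, usable = total - 10, False
        if total > 21:
            return trajectory, -1  # player busts
        state = (total, showing, usable)
        action = policy[state]


def mc_exploring_starts(episodes=200_000, seed=0):
    rng = random.Random(seed)
    states = [(t, d, u) for t in range(12, 22)
              for d in range(1, 11) for u in (False, True)]
    policy = {s: (STICK if s[0] >= 20 else HIT) for s in states}
    q = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(episodes):
        trajectory, reward = play_episode(policy, rng)
        # No discounting and a single terminal reward, so the return that
        # follows every visited (state, action) pair equals the final reward.
        for state, action in trajectory:
            counts[(state, action)] += 1
            q[(state, action)] += (reward - q[(state, action)]) / counts[(state, action)]
            policy[state] = max((STICK, HIT), key=lambda a: q[(state, a)])
    return policy, q
```

Calling `policy, q = mc_exploring_starts()` returns a greedy policy over (player sum, dealer's showing card, usable ace) states, which could then be tabulated or plotted in the report. States cannot repeat within an episode here, so first-visit and every-visit averaging coincide.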

This project includes 1 important file; please review it before submitting a proposal.

Required skills and expertise

Electrical Engineering, Mathematics, Software Engineering, R Programming, Artificial Intelligence

Deadline

20 days

Bidding status

Closed

About the employer

User386156

Joined three years ago

5979 projects posted,

4 projects in progress,

0 projects open for proposals,

Proposal acceptance rate: 25%
